MathML-in-HTML5


r...@maths.uq.edu.au

Sep 23, 2006, 9:57:55 AM
to dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
I am currently driving an effort to enable MathML-in-HTML (in addition to
the MathML-in-XHTML that we already support). I have a patch that serves
the dual purpose of showing where things are going and raising the issues
to ponder.

Here is a
[screenshot] https://bugzilla.mozilla.org/attachment.cgi?id=239771
which is a _live_ rendering of this testcase:
[mathml-in-html] https://bugzilla.mozilla.org/attachment.cgi?id=239769

Those interested in following this up can see bug 353926:
https://bugzilla.mozilla.org/show_bug.cgi?id=353926

Quick background:
=================

At the Firefox engineering meeting in Mountain View (December 2005),
I pleaded that we enable MathML in HTML5 to advance the cause
of MathML, which is so far locked in an XHTML/XML world that does not
seem to be going anywhere in terms of display content as opposed to
data (witness the WHATWG effort -- http://www.whatwg.org). Those to
whom I spoke included dbaron, hixie and sicking, and they welcomed the
suggestion, asking for a broader discussion. Hixie raised the caveat
that MathML elements should still remain in the MathML namespace. He
e-mailed me a while ago about a discussion on this matter in the
WHATWG mailing list, which can be seen here
http://listserver.dreamhost.com/pipermail/whatwg-whatwg.org/2006-June/thread.html.

That discussion is however too broad and involves tangential issues such as
inventing another syntax, etc. My original take was simply to enable
MathML+HTML, in the same vein as we have MathML+XHTML. I think MathML
is suffering from having to fight the battle for adoption of XHTML as
well. As a niche technology, it does not have the means to wage such
a fight. What it simply needs is MathML-in-HTML. W3C failed to
recognise that it could retrofit MathML in HTML -- see this archived
post for some insight:
http://groups.google.com/group/netscape.public.mozilla.mathml/msg/4d58c35217afcb54?dmode=source
But HTML5 being shepherded by WHATWG could provide the right framework
for this to happen now.

I have finally been able to code this up (while keeping MathML
elements in the MathML namespace). I attached the patch I had so far
in bug 353926.

Design & Technical issues:
==========================

How does MathML-in-HTML5 work?

We support MathML-in-HTML5 when these two conditions are met:

1. The DOCTYPE of the document says so. If yes, we enable
MathML entities (TODO) and flag mMayHaveMathML in the HTML content sink.

2. And either a) OR b) is met:

a) <html> has the MathML namespace as the value of an attribute with a
prefix, e.g., <html xmlns:m="http://www.w3.org/1998/Math/MathML">.

In this case, we cache the prefix "m" in mMathMLNameSpacePrefix,
and we intercept all <m:tag> in the document and create
MathML content nodes for them.

b) MathML fragments are in the document as
<math xmlns="http://www.w3.org/1998/Math/MathML">
...
</math>

In this case, we intercept all non-HTML elements inside the <math> tag
and create MathML content nodes for them.
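The two conditions above can be sketched in a few lines of Python. This is only an illustration of the decision logic as described, not the code in the patch: the function names, the `doctype_allows` and `inside_math` flags, and the `html_tags` set are all hypothetical stand-ins for whatever the HTML content sink actually tracks.

```python
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

def find_mathml_prefix(html_attrs):
    """Return the prefix bound to the MathML namespace on <html>, or None.

    `html_attrs` is a plain dict of attribute name -> value (an assumption
    made for this sketch).
    """
    for name, value in html_attrs.items():
        if name.startswith("xmlns:") and value == MATHML_NS:
            return name[len("xmlns:"):]
    return None

def is_mathml_element(tag, doctype_allows, prefix, inside_math, html_tags):
    """Decide whether a MathML content node should be created for `tag`."""
    if not doctype_allows:
        # Condition 1: the DOCTYPE must opt in.
        return False
    if prefix is not None and tag.startswith(prefix + ":"):
        # Condition 2a: prefixed tag such as <m:mrow>.
        return True
    # Condition 2b: a non-HTML element inside <math xmlns="...MathML">.
    return inside_math and tag not in html_tags
```

With these definitions, `<m:mrow>` is intercepted only when both the DOCTYPE flag and the cached prefix are present, while `<b>` inside `<math>` stays an HTML node, which is exactly the behavior discussed under issue 2b below.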

Issues:
1. Tag soup: we understand that we are exposing ourselves to this.

2. a) What about CSS matching rules? From the Style System point of view,
the document is still HTML, but <m:math> is in the MathML namespace. We
might have to special case MathML-in-HTML5 in the Style System as well.

b) The second option raises an issue with HTML-in-MathML, e.g.,
<math xmlns="http://www.w3.org/1998/Math/MathML">
<b>bold</b>
</math>
We don't intercept the <b> in this case. Hence, even though it is
HTML-in-MathML without an explicit XHTML namespace for <b>, the HTML
sink will give <b> an HTML content node. This is not really XHTML-friendly.
On the other hand, we don't want to be an XML parser either... These
are conflicting objectives. We need to decide what to do. We may agree
to only support tags with prefixes as in a), or also keep b) knowing
that it has this XHTML-unfriendly behavior.
---
RBS



Ian Hickson

Sep 23, 2006, 5:06:07 PM
to r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Sat, 23 Sep 2006 r...@maths.uq.edu.au wrote:
>
> Hixie raised the caveat that MathML elements should still remain in the
> MathML namespace.

I meant in the DOM, I didn't mean in the markup. I don't think we should
have any namespace declarations or namespace prefixes in text/html; I
would just have the HTML parser always support the MathML elements, in
the same way that it supports any random unknown element today, except
that when it sees a MathML element it puts it into the MathML namespace in
the DOM rather than the XHTML namespace.

I really don't think we want to introduce namespace prefixes or namespace
declarations into tag soup. I think that would be a big mistake.
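As a rough sketch of this idea in Python: the parser needs no prefixes or declarations, only a fixed table of element names that it places in the MathML namespace when it builds the DOM. The element set below is an illustrative subset chosen for the example, and `namespace_for` is a hypothetical helper, not part of any parser.

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Illustrative subset only; the real list is still under discussion.
MATHML_ELEMENTS = frozenset({"math", "mrow", "mfrac", "msqrt"})

def namespace_for(tag_name):
    """Pick the DOM namespace for an element seen in text/html markup."""
    if tag_name.lower() in MATHML_ELEMENTS:
        return MATHML_NS
    # Any other element, known or unknown, is treated as HTML.
    return XHTML_NS
```

The markup stays prefix-free; only the resulting DOM node carries the namespace.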

--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'

Paul Topping

Sep 23, 2006, 6:38:52 PM
to Ian Hickson, r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
If MathML is considered a subset of HTML5, then no namespace declaration
would be necessary. However, if MathML is going to work in HTML that
isn't declared as HTML5 (not clear to me from this thread), then the
document would be poorly specified without it, IMHO.

At the risk of inciting an anti-Microsoft backlash, I should remind some
on the list that IE has covered this territory before. They already have
a mechanism for declaring XML islands in HTML that seems to work just
fine. Of course, Mozilla won't be interested in duplicating IE's way of
associating a plugin as the renderer of the namespace in the document.
IMHO, it doesn't belong there anyway. It is better (i.e., more secure) to
keep such associations out of the content.

Paul Topping
Design Science, Inc.
www.dessci.com/mathplayer


Ian Hickson

Sep 23, 2006, 8:08:52 PM
to Paul Topping, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org, r...@maths.uq.edu.au
On Sat, 23 Sep 2006, Paul Topping wrote:
>
> If MathML is considered a subset of HTML5, then no namespace declaration
> would be necessary. However, if MathML is going to work in HTML that
> isn't declared as HTML5 (not clear to me from this thread), then the
> document would be poorly specified without it, IMHO.

As far as HTML5 UAs are concerned, declaring HTML as HTML5 consists of
labelling it as text/html. It isn't clear to me what you would consider
HTML that isn't declared as HTML5. With the exception of quirks which are
required for compatibility with de facto standards that disagree with de
jure standards, HTML has no practical versioning story -- all features
work in all documents, regardless of the official "version" of HTML used.


> At the risk of inciting an anti-Microsoft backlash, I should remind some
> on the list that IE has covered this territory before. They already have
> a mechanism for declaring XML islands in HTML that seems to work just
> fine.

XML data islands don't form part of the parent DOM (they are "islands", as
opposed to part of the document). I'm not sure how wrapping <xml> tags
around the MathML content would help. :-)


> And, I should have added that without a namespace declaration there
> would be no way to differentiate different versions of MathML. While
> most MathML instances are now MathML 2.0, the MathML 3.0 effort is just
> now starting up.

Why would you need to distinguish them? MathML2 is a superset of MathML1,
and (for all intents and purposes) any compliant MathML2 UA can process
any compliant MathML1 content. I would assume that this would continue to
be the case; if not, then this is IMHO a problem with MathML3.

Note that the namespace declaration can't currently distinguish between
MathML1 and MathML2; I don't see any reason why MathML3 would change this.

Chris Chiasson

Sep 24, 2006, 4:58:59 AM
I don't understand. Aren't people who are savvy enough to generate
MathML also savvy enough to generate XHTML? Has anyone actually said,
"That MathML I can handle, but what's this XHTML?"

sha...@shantirao.com

Sep 24, 2006, 12:16:13 PM
On 9/24/2006 1:58 AM, Chris Chiasson wrote:
> I don't understand. Aren't people who are savvy enough to generate
> MathML also savvy enough to generate XHTML? Has anyone actually said,
> "That MathML I can handle, but what's this XHTML?"

Savvy, yes. But also impatient. You will notice that HTML is what gets
used -- not XHTML.

I like to write straight HTML with embedded LaTeX, then run it through a
translator to turn $exponents^2$ into MathML. Sure, HTML->XHTML
converters exist, but again, I'm lazy, selfish, and impatient.

Shanti

Chris Chiasson

Sep 24, 2006, 1:43:39 PM
You find it easier to transform HTML + LaTeX instead of XHTML + LaTeX?
What kind of tools are you using?

David Carlisle

Sep 25, 2006, 5:11:17 AM
to i...@hixie.ch, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org

Ian

> XML data islands don't form part of the parent DOM (they are "islands", as
> opposed to part of the document). I'm not sure how wrapping <xml> tags
> around the MathML content would help. :-)

The syntax Paul was referring to here wasn't the <xml> convention, but
the ability in IE to have (explicitly prefixed) XML elements within an
HTML document with rendering controlled by an external component,
but _without_ any other flag at that point in the markup, such as
<xml> or <object> etc.

In the IE implementation you need to have an <object> in the head
pointing at the particular rendering component, which is fairly horrible,
and you also need to declare the namespace using (a variant of) an
early working draft namespace syntax using a PI, but as Paul said, those
parts needn't be copied. An example of a document using this syntax is
shown here:

http://www.dessci.com/en/products/mathplayer/author/creatingpages.htm#AnatomyMathPlayerWebPage

By using a different classid you can do the same thing to include
(explicitly prefixed) svg into an HTML document and have it rendered by
Adobe's svg viewer, and in principle any other vocabularies (although I
don't personally know of any other implementations of this, except
techexplorer, which is again for MathML).

Having math more or less added directly to HTML would be nice in many
ways, but I'm not sure how well it scales. If you think people might
want to have html+svg+chemml+... then perhaps having an API
that allows processing to be attached to namespaced elements would be
more general. On the other hand that was part of the reason for having
namespaces (and for that matter, xml itself) that people could serve all
sorts of different xml vocabularies and have clients do whatever is
necessary. I suspect part of the reason for "html5" is a feeling that
that never happened and isn't going to be mainstream any time soon, and
that a solution that directly addresses the fixed html vocabulary, with
perhaps two specific extensions such as svg and mathml will in practice
cover the vast majority of browser needs, and other vocabularies can be
transformed to html+.. before being served.

David

Ian Hickson

Sep 25, 2006, 1:38:59 PM
to David Carlisle, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Mon, 25 Sep 2006, David Carlisle wrote:
>
> The syntax Paul was referring to here wasn't the <xml> convention, but
> the ability in IE to have (explicitly prefixed) XML elements within an
> HTML document with rendering controlled by an external component, but
> _without_ any other flag at that point in the markup, such as
> <xml> or <object> etc.

Oh, well, as noted earlier, the idea of namespace prefixes in HTML isn't
one that I personally am particularly fond of.


> I suspect part of the reason for "html5" is a feeling that that never
> happened and isn't going to be mainstream any time soon, and that a
> solution that directly addresses the fixed html vocabulary, with perhaps
> two specific extensions such as svg and mathml will in practice cover
> the vast majority of browser needs, and other vocabularies can be
> transformed to html+.. before being served.

I think that's pretty much exactly correct, yes.

sha...@shantirao.com

Sep 25, 2006, 10:44:15 PM
On 9/24/2006 10:43 AM, Chris Chiasson wrote:
> You find it easier to transform HTML + LaTeX instead of XHTML + LaTeX?
> What kind of tools are you using?

A text editor, and itexMML, of course! Sure, more sophisticated tools
exist, but they aren't very reliable, are they?

Chris Chiasson

Sep 26, 2006, 8:00:53 AM
How would transforming XHTML+LaTeX be harder than HTML+LaTeX with
itexMML?

William F Hammond

Sep 26, 2006, 5:35:39 PM
to dev-tec...@lists.mozilla.org
sha...@shantirao.com writes:

> On 9/24/2006 10:43 AM, Chris Chiasson wrote:
>> You find it easier to transform HTML + LaTeX instead of XHTML + LaTeX?
>> What kind of tools are you using?
>
> A text editor, and itexMML, of course! Sure, more sophisticated tools
> exist, but they aren't very reliable, are they?

Oh?

I expect that the author of whatever more sophisticated tool you try
would like to hear of any lack of reliability you find.

Cheers.

-- Bill

Roger B. Sidje

Sep 26, 2006, 7:59:50 PM
to Ian Hickson, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
> What I would be proposing for HTML5 is just the following list of
> elements:
>
> math, mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom,
> mfenced, menclose, msub, msup, msubsup, munder, mover, munderover,
> mmultiscripts, mtable, mlabeledtr, mtr, mtd, maction

I don't like mlabeledtr very much (I have already expressed my views
about it to folks of the MathML WG), and would hope that they will take
my suggestion for <mtr label="..."> in MathML3. The former is
unnecessarily bloated and doesn't degrade gracefully at all with
renderers that don't support it (not to mention that it is hard to fit
in Gecko's existing table code).

However, your list misses some key tags, in particular leaf tags such as
<mspace/> -- which is sometimes quite useful. Also, <mprescripts/> and
<none/> are needed in <mmultiscripts> (albeit it can be argued that
<none/> is the same as <mrow></mrow> or an empty <mspace/>, but the
differentiation is worthwhile).

In general, I would prefer the list to at least include all the tags
that we already support, and which existing webpages have come to depend
on. This effectively boils down to your list above, excluding
<mlabeledtr>, and including <mspace/>, <mprescripts/>, <none/> and
<mi>, <mn>, <ms>, <mtext>, <mo>. In particular, <mo> is a vital tag as
it is at the heart of those stretchy MathML characters.
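The adjustment described above is simple set arithmetic. A small Python sketch, with variable names chosen only for this example, makes the resulting list explicit:

```python
# The list proposed for HTML5, as quoted at the top of this message.
PROPOSED = {
    "math", "mrow", "mfrac", "msqrt", "mroot", "mstyle", "merror",
    "mpadded", "mphantom", "mfenced", "menclose", "msub", "msup",
    "msubsup", "munder", "mover", "munderover", "mmultiscripts",
    "mtable", "mlabeledtr", "mtr", "mtd", "maction",
}

# Drop <mlabeledtr>; add the empty tags and the token (leaf) tags.
REVISED = (PROPOSED - {"mlabeledtr"}) | {
    "mspace", "mprescripts", "none",
    "mi", "mn", "ms", "mtext", "mo",
}

print(sorted(REVISED))
```

The revised set has 30 element names: the 23 proposed, minus one, plus eight.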

Implementation-wise, as this inclusion of MathML-in-HTML5 marks the
beginning of tag soup, it may be that the HTML parser would have to have
some knowledge of leaf tags, so that for example, a stray <mspace>
doesn't become the root of an entire HTML tree... which is later fed to
the hapless MathML engine. (The patch I attached in bug 353926 ignored
the issue.)
---
RBS

On 26/09/2006 3:59 AM, Ian Hickson wrote:
> On Sun, 24 Sep 2006, Boris Zbarsky wrote:
>
>>Ian Hickson wrote:
>>
>>>We didn't check that <canvas> wouldn't cause clashes, either.
>>
>>I see. I had assumed that we in fact had.
>>
>>
>>>I don't see why. We don't want a flag for when people can use the storage
>>>APIs. Or when they can use <img> elements. Or whatever.
>>
>>True, because those are very unlikely to collide with random stuff the pages
>>are doing (e.g. the storage APIs are using fairly long names that are unlikely
>>to collide with page-defined functions and variables).
>>
>>If we think MathML has a similarly low risk of collision, great.
>
>
> I don't know about "we".
>
> What I would be proposing for HTML5 is just the following list of
> elements:
>
> math, mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom,
> mfenced, menclose, msub, msup, msubsup, munder, mover, munderover,
> mmultiscripts, mtable, mlabeledtr, mtr, mtd, maction
>
> ...and of those only <math> came up in the top 1000 elements in my
> search of elements on about one billion pages.
>
> According to that same research, <math> is, on the Web, less frequent than
> the following elements: <m>, <e>, <rem>, <tab>, <yr>, <prohibits>, <your>,
> <lable>, <text-spez>, etc. It was present on less than 0.002% of the pages
> the research covered. (To give an idea of scale, <h8> is used on more than
> 0.003%, so if we avoid <math> because of this, we should probably
> introduce <h7> and <h8> into HTML, since we're saying that's an important
> enough level to worry about.)
>
> Now, of course, it could be that those 0.002% of pages are all hugely
> important and that we'll break the Web in adding this feature. We can't
> know until we've tried.
>

Ian Hickson

Sep 26, 2006, 8:16:07 PM
to Roger B. Sidje, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Wed, 27 Sep 2006, Roger B. Sidje wrote:
>
> I don't like mlabeledtr very much (I have already expressed my views
> about it to folks of the MathML WG), and would hope that they will take
> my suggestion for <mtr label="..."> in MathML3. The former is
> unnecessarily bloated and doesn't degrade gracefully at all with
> renderers that don't support it (not to mention that it is hard to fit
> in Gecko's existing table code).

I'm happy to drop/add any tag to this list. Just give me the list you
want.


> However, your list misses some key tags, in particular leaf tags such as
> <mspace/> -- which is sometimes quite useful. Also, <mprescripts/> and
> <none/> are needed in <mmultiscripts> (albeit it can be argued that
> <none/> is the same as <mrow></mrow> or an empty <mspace/>, but the
> differentiation is worthwhile).

I missed anything that wasn't in the table I happened upon in the spec. I
didn't look very closely for the exact table I wanted.

Tell me what tags you want to have and we'll make that the list. You're
the expert. :-)


> Implementation-wise, as this inclusion of MathML-in-HTML5 marks the
> beginning of tag soup, it may be that the HTML parser would have to have
> some knowledge of leaf tags, so that for example, a stray <mspace>
> doesn't become the root of an entire HTML tree... which is later fed to
> the hapless MathML engine. (The patch I attached in bug 353926 ignored
> the issue.)

Don't worry, these tags auto-close when a parent tag is closed.

<foo><bar><baz></foo><quux>

...results in this DOM:

<foo>
  <bar>
    <baz>
<quux>

For leaf nodes with following siblings, people will have to use end tags,
as in:

<foo><bar></bar><baz></baz></foo><quux></quux>

If we want to start adding actual leaf tags, I'd rather do this in a
second stage, after we have a proof of concept. (I've so far avoided
adding any new tags to the HTML5 parser spec, but eventually there will be
a bunch we have to add.)

We can go from non-empty to empty much more easily than from empty to
non-empty.
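The auto-close behavior can be illustrated with a toy stack-based tree builder in Python. This is a sketch under simplifying assumptions (start and end tags only, no attributes, no text, no special HTML rules) and `build_tree` is a made-up name; it is not the HTML5 parsing algorithm.

```python
import re

def build_tree(markup):
    """Return the parsed tree as nested [name, children] lists."""
    root = ["#root", []]
    stack = [root]  # open elements, innermost last
    for closing, name in re.findall(r"<(/?)([A-Za-z][A-Za-z0-9]*)\s*>", markup):
        if not closing:
            # Start tag: append to the current open element and descend.
            node = [name, []]
            stack[-1][1].append(node)
            stack.append(node)
        elif any(open_node[0] == name for open_node in stack[1:]):
            # End tag: pop (auto-close) everything up to and including
            # the matching open element.
            while stack.pop()[0] != name:
                pass
        # An end tag with no matching open element is ignored.
    return root[1]
```

Fed `<foo><bar><baz></foo><quux>`, this nests `baz` inside `bar` inside `foo` and makes `quux` a sibling of `foo`. With explicit end tags, as in `<foo><bar></bar><baz></baz></foo><quux></quux>`, `bar` and `baz` become siblings.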

Roger B. Sidje

Sep 26, 2006, 9:03:25 PM
to Ian Hickson, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On 27/09/2006 10:16 AM, Ian Hickson wrote:

> I'm happy to drop/add any tag to this list. Just give me the list you
> want.

OK.

> For leaf nodes with following siblings, people will have to use end tags,
> as in:
>
> <foo><bar></bar><baz></baz></foo><quux></quux>
>
> If we want to start adding actual leaf tags, I'd rather do this in a
> second stage, after we have a proof of concept. (I've so far avoided
> adding any new tags to the HTML5 parser spec, but eventually there will be
> a bunch we have to add.)

OK, I see.

The other issue is those 2000 entities that MathML has. You said that
you are not a big fan of a namespace thingy on the root <html> element.

Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all
W3C entities _by default_? We have a proof-of-concept of that in View
Selection Source, BTW. It will display any entity it can.
http://lxr.mozilla.org/mozilla/source/content/base/public/nsIDocumentEncoder.idl#125
As VSS has stood the test of time without major complaints, perhaps
<!DOCTYPE html> could assume that too? If that is agreed, we are all clear.

The other remaining issue might be with style matching because <math>
will then be internally in the MathML namespace whereas the HTML
document is in the none namespace (at present), but we will see how it
goes from there.
---
RBS

Ian Hickson

Sep 26, 2006, 9:23:38 PM
to Roger B. Sidje, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Wed, 27 Sep 2006, Roger B. Sidje wrote:
>
> The other issue is those 2000 entities that MathML has.

Yeah... Do we really need those? Some of them seem reasonable to add, but
2000 seems like too many for the mnemonic advantage to beat just using
Unicode codepoints...

The problem with adding entities is that a LOT of people do things like

href="/u?aa=foo&ab=foo&ac=foo&ad=foo"

...which today works, but would break if MathML entities were introduced
(since &ac is a MathML entity).
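To make the collision concrete, here is a deliberately simplified Python sketch of tag-soup entity decoding in which the trailing semicolon is optional. The `decode_entities` function and the tiny tables are hypothetical; real parsers use longer tables and more involved matching rules. The code point U+223E for `ac` is taken from the MathML entity set as I understand it.

```python
import re

def decode_entities(text, table):
    """Replace &name or &name; when `name` is in `table` (semicolon optional)."""
    def replace(match):
        name = match.group(1)
        # Unknown names are left exactly as written.
        return table[name] if name in table else match.group(0)
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);?", replace, text)

url = "/u?aa=foo&ab=foo&ac=foo&ad=foo"

html_only = {"amp": "&", "lt": "<", "gt": ">"}
with_mathml = dict(html_only, ac="\u223e")  # assumed MathML entity 'ac'

print(decode_entities(url, html_only))    # query string survives untouched
print(decode_entities(url, with_mathml))  # "&ac" is swallowed by the entity
```

Under the first table the URL is unchanged; under the second, the `ac` parameter disappears into a single character, which is the breakage described above.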


> Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all
> W3C entities _by default_?

Don't do anything based on the DOCTYPE. HTML5 is anything sent as
text/html.


> The other remaining issue might be with style matching because <math>
> will then be internally in the MathML namespace whereas the HTML
> document is in the none namespace (at present), but we will see how it
> goes from there.

I don't see why this would cause any problems.

Roger B. Sidje

Sep 26, 2006, 11:10:17 PM
to Ian Hickson, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On 27/09/2006 11:23 AM, Ian Hickson wrote:
>
> The problem with adding entities is that a LOT of people do things like
>
> href="/u?aa=foo&ab=foo&ac=foo&ad=foo"
>
> ...which today works, but would break if MathML entities were introduced
> (since &ac is a MathML entity).
>

That list is so big that trying to hand-pick some and leaving some out
would need another committee...

>>Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all
>>W3C entities _by default_?
>
>
> Don't do anything based on the DOCTYPE. HTML5 is anything sent as
> text/html.

I thought the DOCTYPE was trustworthy -- based on this excerpt from the
HTML5 spec:

"HTML documents that use the new features described in this
specification must start with the string <!DOCTYPE html> and, if they
are served over the wire (e.g. by HTTP) must be labelled with the
text/html MIME type."

If so, it would have meant fewer conflicts with agreed entities in HTML5.

BTW, for my own information, do you intend HTML5 to be transitional,
almost-standards, or strict? If it is HTML5 (or XHTML5) served as
text/html but put in the XHTML namespace at some later stage (as the
HTML5 implies), it better be strict, no? And that would be driven by the
DOCTYPE detection code. Catch my drift? Or is tag soup going to be in
the XHTML namespace?

If it is strict then maybe entities could be required to have a
semi-colon -- which will then avoid the ambiguities you mentioned above.

Not that I have a position on this (at least as yet). I am just bringing
in some food for thought, to accommodate the realistic issues of MathML.
---
RBS

Ian Hickson

Sep 27, 2006, 1:59:04 AM
to Roger B. Sidje, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Wed, 27 Sep 2006, Roger B. Sidje wrote:
> On 27/09/2006 11:23 AM, Ian Hickson wrote:
> >
> > The problem with adding entities is that a LOT of people do things
> > like
> >
> > href="/u?aa=foo&ab=foo&ac=foo&ad=foo"
> >
> > ...which today works, but would break if MathML entities were
> > introduced (since &ac is a MathML entity).
>
> That list is so big that trying to hand-pick some and leaving some out
> would need another committee...

Not really... I say we just add ApplyFunction, InvisibleComma, and
InvisibleTimes (but not their short aliases).


> > > Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting
> > > all W3C entities _by default_?
> >
> > Don't do anything based on the DOCTYPE. HTML5 is anything sent as
> > text/html.
>
> I thought the DOCTYPE was trustworthy -- based on this excerpt from the
> HTML5 spec:
>
> "HTML documents that use the new features described in this
> specification must start with the string <!DOCTYPE html> and, if they
> are served over the wire (e.g. by HTTP) must be labelled with the
> text/html MIME type."

That's an authoring conformance requirement, and has no bearing on
implementations.


> BTW, for my own information, do you intend HTML5 to be transitional,
> almost-standards, or strict?

HTML5 documents starting with <!DOCTYPE HTML> must be in standards mode.
Documents with other DOCTYPEs or no DOCTYPE at all may be in another mode,
as already described in the spec. In due course I may specify quirks mode
and then there'll just be the spec, and no other modes.


> If it is HTML5 (or XHTML5) served as text/html but put in the XHTML
> namespace at some later stage (as the HTML5 implies), it better be
> strict, no? And that would be driven by the DOCTYPE detection code.
> Catch my drift? Or is tag soup going to be in the XHTML namespace?

Not sure what you mean by that. All HTML DOM nodes are (per HTML5) in the
XHTML namespace, irrespective of the standards/quirks thing.


> If it is strict then maybe entities could be required to have a
> semi-colon -- which will then avoid the ambiguities you mentioned above.

That would break back-compat.
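A short Python sketch of the trade-off, using a hypothetical `decode` helper and toy tables: requiring the semicolon does protect query strings, but existing pages that write legacy entities such as `&copy` without it would stop decoding.

```python
import re

LEGACY = {"copy": "\u00a9", "amp": "&"}  # entities often written without ';'

def decode(text, table, require_semicolon):
    """Decode named entities; optionally insist on the trailing semicolon."""
    if require_semicolon:
        pattern = r"&([A-Za-z][A-Za-z0-9]*);"
    else:
        pattern = r"&([A-Za-z][A-Za-z0-9]*);?"
    # Names missing from the table are left as written.
    return re.sub(pattern, lambda m: table.get(m.group(1), m.group(0)), text)

print(decode("&copy 2006", LEGACY, require_semicolon=False))
print(decode("&copy 2006", LEGACY, require_semicolon=True))
```

The first call decodes the copyright sign as today's pages expect; the second leaves `&copy 2006` as literal text.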

William F Hammond

Sep 27, 2006, 12:25:11 PM
to dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
Ian Hickson <i...@hixie.ch> writes:

> On Wed, 27 Sep 2006, Roger B. Sidje wrote:
> . . .
>> Implementation-wise, as this inclusion of MathML-in-HTML5 marks the
>> beginning of tag soup, ...
>
> Don't worry, these tags auto-close when a parent tag is closed.

Two points for clarification:

1. There's the old issue, related to dual parsers, of trying to get
Mozilla family user agents to give proper handling of XHTML+MathML
when served through text/html -- following early Amaya practice. (In
the end the W3C HTML WG refused to support this idea and spawned the
mimetype application/xhtml+xml.) It seems that formally correct
XHTML+MathML would now gain coverage as text/html under current WhatWG
thinking, at least when XML namespaces are evident only through use of
the xmlns attribute (which would be ignored in tag soup), i.e., no use
of xml namespace prefixing. Is this correct?

2. Is WhatWG entertaining the idea that off-the-cuff tag soup writers
will generate MathML content that's good enough for Mozilla rendering?

---

In case you don't know:

The W3C Math group has announced that it is beginning to think seriously
about author-level markup for math.

Long term -- say ten years in the future (we've already been at this
for ten years) -- I think author level math additions to the tag soup
vocabulary would work out much better, especially with enhanced CSS
support.

Cheers.

-- Bill

----------------------------------------------------------------------
William F. Hammond Dept. of Mathematics & Statistics
518-442-4625 The University at Albany
hammond At math.albany.edu Albany, NY 12222 (U.S.A.)
http://www.albany.edu/~hammond/ Dept. FAX: 518-442-4731
----------------------------------------------------------------------

David Carlisle

Sep 27, 2006, 12:44:42 PM
to r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org
I don't think I saw Ian's original comment, just Roger's reply.

> What I would be proposing for HTML5 is just the following list of
> elements:
>
> math, mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom,
> mfenced, menclose, msub, msup, msubsup, munder, mover, munderover,
> mmultiscripts, mtable, mlabeledtr, mtr, mtd, maction

You would need to include the leaf elements (mi, mn, mo, mtext); otherwise
there'll be no characters in the MathML! Also, mspace is pretty
important.

But as a more general point, I think it's dangerous for a spec to be
profiled by _implementations_. The Math WG activity has just been
restarted at W3C, and if there is a need to profile MathML to
presentation MathML (or a subset thereof), please can it be done _there_
so that there is some chance that MathML authoring tools can be
customised to have options to generate code to match any profiled spec.

> I don't like mlabeledtr very much (I have already expressed my views
> about it to folks of the MathML WG)

Roger, I don't see anything searching for
http://www.w3.org/Search/Mail/Public/search?type-index=www-math&index-type=t&keywords=mlabeledtr&search=Search
I know you've talked to us at conferences etc, but we're all getting old
and if comments aren't on the comment list, then they are likely to get
forgotten over time.

_Now_ would be a really good time to make such comments, as we are in the
process of finalising the requirements for what extra features should
be in MathML3 and what features, if necessary, should be deprecated.


I don't remember specific discussions about an <mtr label="...">. I
would guess there would be some concern about the label being an
attribute rather than an element restricting the possibilities, but
implementation advice on difficulties with the current scheme would be
taken seriously....

Ian wrote about entities


> Yeah... Do we really need those? Some of them seem reasonable to add, but
> 2000 seems like too many for the mnemonic advantage to beat just using
> Unicode codepoints...

I'd say that it's probably not worth including only a few; it would just
lead to confusion. The problem is that much MathML is generated using
tools, and those tools may use entities. If they do, the user
hasn't much control over which are used, or over how to remove
entities that are not supported in the browser. It would be better to
just get the MathML authoring tools to use characters or character refs
directly and tell the user MathML entities are not supported (but HTML
ones are).

David

Ian Hickson

Sep 27, 2006, 2:06:36 PM
to William F Hammond, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Wed, 27 Sep 2006, William F Hammond wrote:
>
> 1. There's the old issue, related to dual parsers, of trying to get
> Mozilla family user agents to give proper handling of XHTML+MathML when
> served through text/html -- following early Amaya practice. (In the end
> the W3C HTML WG refused to support this idea and spawned the mimetype
> application/xhtml+xml.) It seems that formally correct XHTML+MathML
> would now gain coverage as text/html under current WhatWG thinking, at
> least when XML namespaces are evident only through use of the xmlns
> attribute (which would be ignored in tag soup), i.e., no use of xml
> namespace prefixing. Is this correct?

I'm confused by your terminology.

MathML using namespaces and XML syntax would not, under the WHATWG
proposals here, be formally correct. XML sent as text/html is never
correct per the "WHATWG thinking".

What is being proposed here is a non-XML syntax, to be formally described
in the HTML5 specification, which, when processed by an HTML5 UA, would
generate a DOM that can then be processed per the MathML2 specification.

Per the WHATWG specifications, the presence of an "xmlns" attribute is
always a conformance error in any content sent as text/html.


> 2. Is WhatWG entertaining the idea that off-the-cuff tag soup writers
> will generate MathML content that's good enough for Mozilla rendering?

The idea being entertained is that off-the-cuff HTML5 authors, and HTML5
editors, would create content which, when processed by an HTML5 UA (such
as Mozilla, in due course), would render as MathML markup would.


> The W3C Math group has announced that it is beginning to think seriously
> about author-level markup for math.
>
> Long term -- say ten years in the future (we've already been at this for
> ten years) -- I think author level math additions to the tag soup
> vocabulary would work out much better, especially with enhanced CSS
> support.

On the very short term, the proposal here is just a proof of concept. On
the medium term (12 months) I was considering specifying more complex
parsing rules for MathML such that the same MathML2-compatible DOM could
be obtained from much smaller markup, e.g. by implying <mo> tags around
operators and <mn> tags around numbers.
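
To make the idea of implied token tags concrete, here is a minimal
sketch in Python. It is not the parsing rule being proposed (none had
been specified at this point); the token classes, the function name
imply_tokens and the wrapping <mrow> are all assumptions chosen purely
for illustration.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical token classes, chosen only for illustration:
#   mn = decimal number, mi = run of ASCII letters, mo = any other single character.
TOKEN = re.compile(
    r"\s*(?:(?P<mn>\d+(?:\.\d+)?)|(?P<mi>[A-Za-z]+)|(?P<mo>[^\sA-Za-z0-9]))"
)

def imply_tokens(text):
    """Wrap bare numbers, identifiers and operators in <mn>, <mi> and <mo>."""
    parts = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:  # only trailing whitespace is left
            break
        tag = match.lastgroup  # name of the alternative that matched
        parts.append(f"<{tag}>{escape(match.group(tag))}</{tag}>")
        pos = match.end()
    return "<mrow>" + "".join(parts) + "</mrow>"

print(imply_tokens("2+2=4"))
```

A real rule set would also have to decide how multi-character
operators, function names and non-ASCII identifiers are classified,
which this sketch deliberately ignores.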

HTH,

Roger B. Sidje

Sep 28, 2006, 4:52:26 AM
to David Carlisle, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org
On 28/09/2006 2:44 AM, David Carlisle wrote:

> I don't remember specific discussions about an <mtr label="...">. I
> would guess there would be some concern about the label being an
> attribute rather than an element restricting the possibilities, but
> implementation advice on difficulties with the current scheme would be
> taken seriously....

Here is an informative thread about it:
http://groups.google.com/group/netscape.public.mozilla.mathml/browse_thread/thread/d77d015a1fffc6fb/5b0eb0cc9724ce72
(not on www-math, though. Maybe I should forward it there?)

It appeared that attributes (like those in <mfenced>) aren't unanimous
either. But having a bloated tag that won't be implemented in the next
several years isn't really helpful.

> Ian wrote about entities
>
>>Yeah... Do we really need those? Some of them seem reasonable to add, but
>>2000 seems like too many for the mnemonic advantage to beat just using
>>Unicode codepoints...
>
> I'd say that it's probably not worth including only a few, it would just
> lead to confusion.

I am actually a fan of entities because they improve readability a fair
bit. I hope Ian won't give up thinking on this issue so quickly...
especially in the context of MathML where strange characters are quite
common.

As to my suggestion that "if [a document] is strict then maybe entities
could be required to have a semi-colon -- which will then avoid the
ambiguities", to which Ian responded that, "That would break back-compat."

We have other cases of broken back-compat. -- where users were told to
use a non-strict DOCTYPE or some other workaround, e.g., line-height of
images.
---
RBS

David Carlisle

Sep 28, 2006, 5:24:59 AM
to r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org

Roger,
Thanks for the link on <mtr label="mylabel">,

> It appeared that attributes (like those in <mfenced>) aren't unanimous
> either.

yes mfenced also "suffers" from requiring attributes, but probably one
is more likely to need markup in an equation label than in a stretchy
operator. It's not so uncommon to want superscript * or daggers etc to
highlight special versions of formulae, and mfenced is explicitly a
shorthand form so you can always use the mrow/mo form if you need an
operator that is "decorated" in some way. That would not be the case
here if mlabeledtr were deprecated and an attribute form was the
only version. (Actually it would if the attribute could then be
css-styled using css generated content.) Allowing css (or other
mechanism) auto numbering is I think a highly requested feature for
mathml3.


> (not on www-math, though. Maybe I should forward it there?)

Yes please do. When we are doing a pass for errata or pulling in feature
requests for a new version we can do a more or less exhaustive check of
the official comment list but (even with google's help) doing an
exhaustive check of the entire web's a bit hard:-)


The charter for the current working group

http://www.w3.org/Math/Documents/Charter2006.html

has as one of its headline work items

Extension of MathML with enhanced support for equation labeling,
including automatic numbering, general label placement and style, and
resolution of references.

so getting that specified out in a way that ensures that implementations
can implement it sounds like a good idea, and the timing is good now
to get new features in this area if that is needed. If WhatWG members
are interested in mathml most of them are w3c members and could join the
WG of course (currently only Opera is represented out of the main
browser vendors). But WG membership isn't really needed; we can do the
technical discussion on the public www-math list if that is appropriate.

> I am actually a fan of entities because they improve readability a fair
> bit.

Well as you know I've invested a frightening number of hours maintaining
that entity set (and the draft iso set at www.w3.org/2003/entities,
which is the same thing, really) so I also think they are valuable,
although it's a kind of love-hate relationship most of the time:-)

> I hope Ian won't give up thinking on this issue so quickly...
> especially in the context of MathML where strange characters are quite
> common.

Yes I think the ideal situation is that they all be allowed. My comment
was that subsetting them is likely to be more confusing than helpful.

> As to my suggestion that "if [a document] is strict then maybe entities
> could be required to have a semi-colon -- which will then avoid the
> ambiguities", to which Ian responded that, "That would break back-compat."

Requiring a ; would seem reasonable to me (ie make the lack of a ; make
the & into an implicit &amp; rather than be an error as in xml).
That does have a theoretical backward compatibility problem in that
&rightarrow; would be an arrow instead of &amp;rightarrow; but I would
have thought that the occurrences of any such construction outside of
test suites was rather rare.
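
The rule can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the function name decode_strict
is made up, and Python's html.entities.html5 table is used as a
stand-in for the MathML entity set under discussion. A reference is
expanded only when it ends in a semicolon; anything else keeps its
ampersand as literal text.

```python
import re
from html.entities import html5  # maps e.g. "rightarrow;" to "\u2192"

# A named reference counts only if it is terminated by a semicolon.
_REFERENCE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def decode_strict(text):
    """Expand named references only when they end in a semicolon."""
    def replace(match):
        # Unknown names are left untouched as literal text.
        return html5.get(match.group(1) + ";", match.group(0))
    return _REFERENCE.sub(replace, text)

print(decode_strict("a &rightarrow; b"))   # reference expanded
print(decode_strict("a &rightarrow b"))    # no semicolon: left as text
print(decode_strict("&amp;rightarrow;"))   # escaped ampersand stays text
```

The first and third calls show the back-compat point: existing pages
that contain a bare &rightarrow; and expect it to display literally
would start showing an arrow, whereas pages that escaped the ampersand
are unaffected.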

David

White Lynx

Sep 28, 2006, 8:38:50 AM
I consider switching from XML to text/html an inappropriate and
pointless development; moreover, it is damaging in the long term.


First of all it is unclear where this idea comes from, as the MathML
community has no legacy text/html content that one should care about.
All MathML content is well-formed (by definition), which means that one
has fewer errors in MathML documents compared to what one would have in
a tagsoup approach; it also means that all MathML content can be
handled with XML tools, can be processed with XSLT, matched using
XPath, mixed with other XML based markup languages (OpenMath, SVG) etc.
There is no single MathML implementation that supports text/html
tagsoup but does not support X(HT)ML, while the inverse is not true: there
are XML-only MathML implementations that by definition have nothing to
do with HTML legacy.

Further it is not clear to me why this has to be done today, after
paying the price for well-formedness and tackling XML related problems for
seven years. Just when MSIE/MathPlayer finally accept application/xhtml+xml
and thus allow people to deliver the same XHTML+MathML to
MSIE/MathPlayer and Mozilla (one can add Opera with UserJS), someone
decides to revert (more precisely convert) everything to tagsoup.

The profiling policy sounds unclear and strange to me. Solving the issue on
the level of "I'm happy to drop/add any tag to this list. Just give me the
list you want" or based on the level of MathML support in some particular
implementation seems to be irresponsible.
There are at least two subgroups in the W3C Math WG that one could drop a
message with a profile proposal to after looking at the "wrong table".
One is called the liaison with WhatWG subgroup and, as the name suggests, is
expected to ensure that the needs of MathML are addressed in WhatWG specs.
Another is the liaison with CSS subgroup, which is expected to define a MathML
profile suitable for usage in the XML+CSS framework and a few CSS
extensions needed to format the proposed MathML profile.
There is also a subgroup that deals with compound document formats. My
opinion is that profiling of MathML should be coordinated with these
units, as irresponsible steps may spoil W3C efforts in the same area.

One more thing that sounds illogical and rather strange is that
Mozilla/WhatWG try to move MathML further from the XML+CSS framework,
by converting XML to tagsoup with ad hoc parsing rules and embracing
constructions like mstyle, mpadded in the "proposed" profile.


Ian Hickson

Sep 28, 2006, 2:45:36 PM
to Roger B. Sidje, dev-tec...@lists.mozilla.org, David Carlisle, dev-tec...@lists.mozilla.org
On Thu, 28 Sep 2006, Roger B. Sidje wrote:
> >
> > Ian wrote about entities
> >
> > > Yeah... Do we really need those? Some of them seem reasonable to add, but
> > > 2000 seems like too many for the mnemonic advantage to beat just using
> > > Unicode codepoints...
> >
> > I'd say that it's probably not worth including only a few, it would just
> > lead to confusion.
>
> I am actually a fan of entities because they improve readability a fair
> bit. I hope Ian won't give up thinking on this issue so quickly...
> especially in the context of MathML where strange characters are quite
> common.

I really don't want to start introducing weird rules for parsing entities
(I'm trying to simplify the entity parsing rules, not make them worse). At
least not at this stage. Maybe once we have a proof-of-concept working, it
would make more sense to revisit the issue, but I'd want to do a thorough
scan of the Web to see how common these entities actually are today.


> As to my suggestion that "if [a document] is strict then maybe entities
> could be required to have a semi-colon -- which will then avoid the
> ambiguities", to which Ian responded that, "That would break
> back-compat."
>

> We have other cases of broken back-compat. -- where users were told to
> use a non-strict DOCTYPE or some other workaround, e.g, line-height of
> images.

Yeah. And we can see how well _that_ went. QA nightmare, multiple
overlapping codepaths, obscure bugs, confused authors, contradicting
documentation, etc. Let's not go there again. The whole point of
MathML-in-HTML is to have back-compat work -- if we didn't care about
back-compat, we would just have people use MathML-in-XHTML.

Roger B. Sidje

Sep 28, 2006, 9:45:46 PM
to David Carlisle, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org
On 28/09/2006 7:24 PM, David Carlisle wrote:

> Roger,
> Thanks for the link on <mtr label="mylabel">,
>
>>It appeared that attributes (like those in <mfenced>) aren't unanimous
>>either.
>
> yes mfenced also "suffers" from requiring attributes, but probably one
> is more likely to need markup in an equation label than in a stretchy
> operator. It's not so uncommon to want superscript * or daggers etc to
> highlight special versions of formulae, and mfenced is explicitly a
> shorthand form so you can always use the mrow/mo form if you need an
> operator that is "decorated" in some way. That would not be the case
> here if mlabeledtr were deprecated and an attribute form was the
> only version. (Actually it would if the attribute could then be
> css-styled using css generated content.) Allowing css (or other
> mechanism) auto numbering is I think a highly requested feature for
> mathml3.

The danger (and problem) with that tag is that it is over-designed to
accommodate the tiny set of special-cases you alluded to, while holding
the 99.99% majority of cases hostage. One could put up with CDATA all
the way, e.g., (6') or (7*), (8&dagger;), (9a), etc -- if a subequation
is really needed. I would think we can put up with this and reap the
benefits: a <mtr label="mylabel"> tag that stands a chance, degrades
gracefully, *free* cross-referencing (with href#mylabel -- by just
invoking what the browser already does with <a name="...">), the
counters that you mentioned (which work in Gecko today, BTW), etc.
(Also conceivable, optimistically, is a pseudo-class :label to style the
label text, but we might be getting ahead of ourselves...)

Seems to me that the concrete benefits that might result outweigh the
feeling against an attribute.
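
As a rough illustration of the numbering and cross-referencing that an
attribute form would allow, here is a minimal Python sketch. The label
attribute on <mtr> is the hypothetical one discussed above (it is not
in MathML 2), and the sample document and the function names
number_labels and resolve_refs are likewise assumptions made for this
example. It mimics what a CSS counter plus an href="#mylabel" link
would do in the browser.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <mtr label="..."> is the attribute form proposed here.
DOC = """<body>
<math><mtable>
<mtr label="euler"><mtd><mi>a</mi></mtd></mtr>
<mtr><mtd><mi>b</mi></mtd></mtr>
<mtr label="gauss"><mtd><mi>c</mi></mtd></mtr>
</mtable></math>
<p>See <a href="#euler"/> and <a href="#gauss"/>.</p>
</body>"""

def number_labels(root):
    """Number labelled rows in document order, like a counter on mtr[label]."""
    numbers = {}
    for row in root.iter("mtr"):
        label = row.get("label")
        if label is not None:
            numbers[label] = len(numbers) + 1
    return numbers

def resolve_refs(root):
    """Fill each <a href="#label"> with the number of the row it points to."""
    numbers = number_labels(root)
    for link in root.iter("a"):
        target = link.get("href", "")
        if target.startswith("#") and target[1:] in numbers:
            link.text = f"({numbers[target[1:]]})"
    return root

root = resolve_refs(ET.fromstring(DOC))
print([link.text for link in root.iter("a")])
```

Only rows that carry a label are counted, so the unlabelled middle row
does not consume a number.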

>
>>(not on www-math, though. Maybe I should forward it there?)
>
> Yes please do.

OK.

> Well as you know I've invested a frightening number of hours maintaining
> that entity set (and the draft iso set at www.w3.org/2003/entities,
> which is the same thing, really) so I also think they are valuable,
> although it's a kind of love-hate relationship most of the time:-)

Yeah. Let's hope Ian is listening and keeps these entities on his radar...
---
RBS

Juan R.

Sep 29, 2006, 4:04:20 AM
Ian Hickson wrote:
>
> I'm happy to drop/add any tag to this list. Just give me the list you
> want.
>

Ok, this is one in LISP syntax for lists: ()

>
> --
> Ian Hickson U+1047E )\._.,--....,'``. fL
> http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
> Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'

No need to reply to the rest of what you are promoting, since basically you
may think -parodying you- that MathML in HTML 5 is <anything sent as
text/html>

Far from simplifying the authoring of mathematical docs and spreading
online maths, you are really making communication more difficult still
for all of us with this strange hybrid that convinces nobody.


Juan R.

Center for CANONICAL |SCIENCE)

David Carlisle

Sep 29, 2006, 4:40:40 AM
to r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org

> Seems to me that the concrete benefits that might result outweigh the
> feeling against an attribute.

Which is why it's good to get real implementation experience into the
language design (or update). Either by implementors joining the WG or
by doing the technical design on the public www-math list so you and
others can join in (or both).

David


sha...@shantirao.com

Oct 2, 2006, 12:13:04 AM
I would like to call attention to RBS's original point: MathML is
imperiled, and something *can* be done. To expound:

1. MathML is nifty. It's the best thing since LaTeX. In fact, it's the
only thing since LaTeX. My colleagues admire my Mozilla-rendered
documents while they struggle with MS Word.

2. MathML is in trouble. My colleagues who use IE can't see my
equations. This makes it unacceptable for me to write anything important
in MathML, so long as I want to succeed at my job. So investing effort
into learning or using MathML is a quixotic proposition.

3. XHTML is the web language of the future -- and it always will be. It
might as well be dead. It was born crippled, and it never will catch on,
for the simple economic reason that HTML is easier to use. XHTML was
supposed to be the replacement of HTML. In fact, it was so popular that
we're moving forward with HTML5.

4. Languages that are not easy to write are ignored. The wasteland of
obsolete internet standards is littered with romantic, intellectually
superior, morally defensible languages like XFORMS, VRML, and SVG. Boy,
those sure made our lives better! Compare those to what actually gets
used: unvalidated HTML, CSS, JavaScript, and the DOM. All marginally
self-consistent languages that are easy to write and tolerant of abuse.

5. If MathML is not widely understood and easily used by browsers, say
by being a part of HTML5, then sites that drive technology adoption,
like Wikipedia, will have no incentive to switch from the current
TeX->PNG kludge. Lacking a large user base, MathML will not grow.

6. Although the MathML community is self-contained today, we all know
what happens to species that evolve on islands: they get smaller and
prone to extinction. The community needs to grow, and incorporation in
HTML5 is something we should all get behind.

Shanti

* Camel = a horse designed by committee

Chris Chiasson

Oct 2, 2006, 1:10:45 AM
The reason MathML is in trouble is because Microsoft hasn't implemented
it (and many other good XML technologies) natively into Internet
Explorer. They are using their inertia to screw over the open
standards. It's hard for Firefox to compete in a (MathML) market that
doesn't exist.

The best thing that could be done without MS help is to make XML
handling plugins as ubiquitous and easy to install as Adobe's
Macromedia Flash plugin.

Maybe it would be prudent to make an open source MathML and SVG plugin
for IE so people don't have to rely on the changing winds of corporate
desires and licensing. You could call it something like Firefox
sub-rendering for IE - or whatever.

White Lynx

Oct 2, 2006, 5:33:30 AM
> XHTML is the web language of the future -- and it always will be. It
> might as well be dead.

MathML is not necessarily confined to XHTML; it may use other XML
applications as host languages. In particular one can name several XML
applications that are much more suitable for encoding scientific
articles than XHTML (NIH Journal Publishing DTD, DocBook, TEI). Of
course XHTML will remain the most widespread host language for
MathML, but it is not something that MathML absolutely depends on. And
XML in general is apparently not dead; it is enough for MSIE to fix
their broken parser, and the people that yesterday argued that we all
must switch to XHTML, and today argue that HTML5 is the only way to go,
may tomorrow adjust their opinion once more. It should not be a
problem.

> Languages that are not easy to write are ignored

Well compare XML
<mmultiscripts><mi>A</mi><mprescripts/><none/><mi>B</mi></mmultiscripts>
and HTML
<mmultiscripts><mrow>A</mrow><mprescripts></mprescripts><none></none><mrow>B</mrow></mmultiscripts>
which, when processed by the parser, will generate the mi-mo-mu tagsoup
automatically
<mmultiscripts><mrow><mi>A</mi></mrow><mprescripts></mprescripts><none></none><mrow><mi>B</mi></mrow></mmultiscripts>
So how has switching to HTML helped to make the language human processable?

I am definitely for turning MathML into a human processable language, and
removing mi-mo-mu (explicit markup is useful for stuff like integrals,
N-ary operators, delimiters, but otherwise it is just bloat:
<mn>2</mn><mo>+</mo><mn>2</mn><mo>=</mo><mn>4</mn>); however, this can
and should be done within XML, without introducing telepathic
parsing rules. If mi-mo-mu are not available in the original source and are
generated by the parser then their semantic value is exactly zero (and yes
I know that it is close to zero in any case). The ECMA approach is one
possible way to remove mi-mo-mu and use something like <nary> (but
not exactly the <nary> construction, which is the most CSS unfriendly part
of ECMA math markup) for operators and just <i> for italic.
So we should either remove it from MathML (the problem however is lack
of consensus in the WG on the issue) or keep it. Removing it from the source
but keeping it in the DOM does not make any sense, as you remove the
semantics but keep this stuff in the DOM.

> Although the MathML community is self-contained today, we all know
> what happens to species that evolve on islands: they get smaller and
> prone to extinction.

Integration with the environment in which formulae are embedded is crucial
for any mathematical markup. All other approaches are closely
integrated in some extensible framework with a powerful formatting
mechanism (LaTeX/TeX, ISO-12083/SGML+DSSSL, OfficeMath/WordML).
Extensibility and availability of a full-featured style language or
equivalent formatting mechanism are crucial here. In the case of MathML the
environment is the web, so integration of MathML into an extensible framework
is integration into XML+CSS+DOM, which is on the agenda of the Math WG. In
contrast HTML5 does not give us an extensible framework, and ad hoc parsing
rules do not help us to integrate MathML with CSS while keeping the DOM
synchronised with the actual markup.

William F Hammond

Oct 2, 2006, 11:23:32 AM
to dev-tec...@lists.mozilla.org
"Chris Chiasson" <chris.c...@gmail.com> writes:

> ...


> The best thing that could be done without MS help is to make XML
> handling plugins as ubiquitous and easy to install as Adobe's
> Macromedia Flash plugin.

I acquired a new machine with MS Windows XP (Home) recently and found
that it had both IE and AOL/NetScape visible as desktop icons. Of
course, NetScape rendered XHTML+MathML.

Installing the Design Science plugin for IE called MathPlayer was
quite easy, but one does need to know where to go to get it. So the
math community might consider advertising its location -- or at least
advising its readers to google for "MathPlayer".

> Maybe it would be prudent to make an open source MathML and SVG plugin
> for IE so people don't have to rely on the changing winds of corporate
> desires and licensing. You could call it something like Firefox
> sub-rendering for IE - or whatever.

What was new for me about the OEM NetScape was that it would render in
IE mode if asked.

It's curious that an OEM platform should include NetScape along with
IE but not provide a seamless plugin for IE. Perhaps Microsoft will
want to rethink that.

-- Bill

Paul Topping

Oct 2, 2006, 12:16:39 PM
to William F Hammond, dev-tec...@lists.mozilla.org
Hi,

Thanks Bill (Hammond) for mentioning our MathPlayer plugin. While I
understand that people might want IE to support MathML "out of the box",
many capabilities in many apps are provided as plugins. I don't think it
is right to think that all plugins are bad. Plugins allow companies like
mine, with an interest in providing technology in a particular area, to
move technology forward independently of monsters like Microsoft. In
other words, if Microsoft provided MathML support in IE, it wouldn't be
as good as MathPlayer and everyone would be complaining about that.

Of course, demanding that Microsoft support XHTML in IE is perfectly
reasonable. IE does a really good job, IMHO, of allowing plugins like
MathPlayer to support embedded XML languages, but only in HTML, not XHTML.
MathPlayer works around this by allowing carefully prepared XHTML+MathML
to work in IE but proper support for XHTML in IE would be better.

Paul Topping
President & CEO

Design Science, Inc.
"How Science Communicates"
Makers of MathType, MathFlow, WebEQ, MathPlayer, Equation Editor,
TeXaide
http://www.dessci.com

> _______________________________________________
> dev-tech-mathml mailing list
> dev-tec...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-tech-mathml
>

Chris Chiasson

Oct 2, 2006, 2:31:40 PM
White Lynx wrote:
> All other approaches are closely
> integrated in some extensible framework with powerful formatting
> mechanism (LaTeX/TeX, ISO-12083/SGML+DSSSL, OfficeMath/WordML).

I don't know how common the knowledge is, but MathML is closely tied
with a certain platform: Mathematica

Wolfram Research (makers of Mathematica) was one of the originators of
MathML. Anyway, present day MathML is strongly related to Mathematica's
internal representation of math, as shown in this short example.

Consider Euler's formula as entered in the most source-code like syntax
available in Mathematica (called InputForm):

E^(I*x)==Cos[x]+I*Sin[x]

After parsing, it becomes this (called FullForm):

Equal[Power[E,Times[Complex[0,1],x]],Plus[Cos[x],Times[Complex[0,1],Sin[x]]]]

Compare this with content MathML

<math xmlns='http://www.w3.org/1998/Math/MathML'>
 <apply><eq/>
  <apply><power/><exponentiale/>
   <apply><times/><imaginaryi/><ci>x</ci></apply></apply>
  <apply><plus/>
   <apply><cos/><ci>x</ci></apply>
   <apply><times/><imaginaryi/><apply><sin/><ci>x</ci></apply></apply>
  </apply>
 </apply>
</math>

Notice how <apply> is used to capture the structure of
head[arg1,arg2,arg3] as <apply><head/><arg1/><arg2/><arg3/></apply>.
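
The correspondence is mechanical enough to sketch in a few lines of
Python. This is only an illustration: the tuple encoding of the
expression and the function name to_content_mathml are assumptions made
here, not anything Mathematica or a MathML tool actually exposes. A
tuple (head, arg1, ...) stands for head[arg1, ...], a one-element tuple
stands for a constant symbol, and a plain string stands for a variable.

```python
def to_content_mathml(expr):
    """Serialise a nested (head, arg1, ...) tuple as content MathML."""
    if isinstance(expr, str):
        return f"<ci>{expr}</ci>"      # a plain variable (no XML escaping here)
    head, *args = expr
    if not args:
        return f"<{head}/>"            # a constant symbol such as <exponentiale/>
    body = "".join(to_content_mathml(arg) for arg in args)
    return f"<apply><{head}/>{body}</apply>"

# Euler's formula, mirroring the FullForm expression in the post above.
euler = (
    "eq",
    ("power", ("exponentiale",), ("times", ("imaginaryi",), "x")),
    ("plus", ("cos", "x"), ("times", ("imaginaryi",), ("sin", "x"))),
)

result = to_content_mathml(euler)
print(result)
```

Apart from the enclosing <math> element, the output has the same
element structure as the content MathML quoted above.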

However, when an equation like this is typeset in Mathematica, it is
converted to a box structure. I'll use StandardForm boxes for this
example:

RowBox[{SuperscriptBox["\[ExponentialE]",RowBox[{"\[ImaginaryI]"," ","x"}]],
 "\[Equal]",
 RowBox[{RowBox[{"Cos","[","x","]"}],"+",
  RowBox[{"\[ImaginaryI]"," ",RowBox[{"Sin","[","x","]"}]}]}]}]

Note that RowBox means that the items within should have the same
baseline. Obviously, the possible necessity to linebreak complicates
things somewhat. However, the box structure remains invariant (which is
why I think it's odd that Firefox doesn't linebreak <mrow>).

Compare this with presentation MathML:

<math xmlns='http://www.w3.org/1998/Math/MathML'>
 <mrow>
  <msup><mi>&#8519;</mi>
   <mrow><mi>&#8520;</mi><mo>&#8290;</mo><mi>x</mi></mrow></msup>
  <mo>&#63449;</mo>
  <mrow>
   <mrow><mi>cos</mi><mo>&#8289;</mo><mo>(</mo><mi>x</mi><mo>)</mo></mrow>
   <mo>+</mo>
   <mrow><mi>&#8520;</mi><mo>&#8290;</mo>
    <mrow><mi>sin</mi><mo>&#8289;</mo><mo>(</mo><mi>x</mi><mo>)</mo></mrow></mrow>
  </mrow>
 </mrow>
</math>

So those plentiful <mrow> elements shouldn't be unexpected. Also, it
becomes pretty apparent why presentation MathML is nearly
incomprehensible. It is a representation of an already verbose two
dimensional box formatting system in XML, making it even more verbose.

Of course, the fact that presentation MathML is a translation of a box
formatting system makes it well suited to styling by CSS.

Mathematica's box formatting subsystem (called the FrontEnd)
understands very few operators (input shortcuts). One of the few is
Rule (lhs->rhs). It doesn't even understand Plus (l+m+r). It certainly
wouldn't understand what's going on if someone left out a RowBox.

In that respect, presentation MathML is slightly more flexible, because
it can "insert" some implicit row boxes when the markup wouldn't make
sense otherwise.

Anyway, I don't speak for WRI, but I think it's fairly obvious they
will try to keep MathML "in their image" so that it will be easy for
them to have an XML language for math that is understood by machines
... aka their computer algebra system.

IMHE (in my humble estimation) Firefox people would be better off
trying to define "shorthand" definitions for the content MathML system,
which WRI will be less likely to oppose.

White Lynx wrote:
> Extensibility and availability of a full-featured style language or
> equivalent formatting mechanism are crucial here.

Agreed. I think it would be imprudent to remove formatting structures
from presentation MathML because that would make it harder to write
appropriate CSS.

Paul Topping

Oct 2, 2006, 3:00:07 PM
to Chris Chiasson, dev-tec...@lists.mozilla.org
Chris,

While Mathematica people were heavily involved in MathML's creation, it
is hardly the result of their effort alone. They provided some much
needed early impetus and hosted two MathML conferences but since then
they have been more noticeable by their absence from the MathML
community. At any rate, the notion that they have some kind of control
over it now is just not even close to being the case.

If anyone has opinions on how MathML can be improved, they should
participate in the W3C's MathML 3.0 effort just getting underway. Then
they can see for themselves that Wolfram/Mathematica doesn't run the
show. Actually, I half expected someone to accuse my company, Design
Science, of that these days.

I would encourage anyone to create front ends that save as MathML.
Either GUI ones like our products or "programming" languages that are
converted into MathML. Now that MathML has been fairly well established
as the XML representation for math, ease of conversion should be a goal
for any front end. However, IMHO ease of use should take priority over
this.

Paul Topping
Design Science, Inc.
www.dessci.com

> -----Original Message-----
> From: dev-tech-ma...@lists.mozilla.org
> [mailto:dev-tech-ma...@lists.mozilla.org] On Behalf
> Of Chris Chiasson
> Sent: Monday, October 02, 2006 11:32 AM
> To: dev-tec...@lists.mozilla.org
> Subject: Re: MathML-in-HTML5
>

Jacques Distler

Oct 2, 2006, 10:44:35 PM
In article
<mailman.6128.115938040...@lists.mozilla.org>, Ian
Hickson <i...@hixie.ch> wrote:

>What is being proposed here is a non-XML syntax, to be formally described
>in the HTML5 specification, which, when processed by an HTML5 UA, would
>generate a DOM that can then be processed per the MathML2 specification.
>
> ...
>
>On the very short term, the proposal here is just a proof of concept. On
>the medium term (12 months) I was considering specifying more complex
>parsing rules for MathML such that the same MathML2-compatible DOM could
>be obtained from much smaller markup, e.g. by implying <mo> tags around
>operators and <mn> tags around numbers.

Please don't go down that road.

Let's not have two incompatible markup languages, both called "MathML,"
one of which can be embedded in HTML5, the other in XHTML.

If you want MathML-in-HTML5, create a profile (along the lines of
XHTML's Appendix C) of MathML 2.0 that is safe to consume by the
Tag-Soup parser.
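
One rule such a profile might contain, by analogy with Appendix C, is
to avoid the XML empty-element shorthand so that a tag-soup parser
never sees <none/>. The Python sketch below shows that single rule
only. The function name tag_soup_safe is made up, and dropping the
namespace is an assumption taken from Ian's remark that xmlns is not
allowed in text/html; a real profile would have to cover far more
(entities, attributes, character encoding).

```python
import xml.etree.ElementTree as ET

def tag_soup_safe(xml_text):
    """Re-serialise MathML without namespaces or self-closing tags."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # ElementTree stores namespaced tags as "{uri}name"; keep the local name.
        element.tag = element.tag.rsplit("}", 1)[-1]
    # short_empty_elements=False writes <none></none> instead of <none/>.
    return ET.tostring(root, encoding="unicode", short_empty_elements=False)

print(tag_soup_safe(
    "<mmultiscripts><mi>A</mi><mprescripts/><none/><mi>B</mi></mmultiscripts>"
))
```

Namespaced attributes are not handled by this sketch and would need
separate treatment.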

--
PGP public key: http://golem.ph.utexas.edu/~distler/distler.asc

Roger B. Sidje

Oct 3, 2006, 1:48:17 AM
to White Lynx, www-...@w3.org, dev-tec...@lists.mozilla.org
On 28/09/2006 10:44 PM, White Lynx wrote:

> I consider switching from XML to text/html an inappropriate and
> pointless development; moreover, it is damaging in the long term.

Damaging to what? To MathML? Not really in my opinion. What damage could
there be to have plenty of MathML formulas on the web?!? But to the
XML/XHTML agenda, possibly. And that has been the real "problem" since
the beginning, and which I alluded to in my opening post. It wasn't a
fight fit for a niche MathML that was already struggling to make a
name for itself.

Interested in using MathML? First pass that XHTML barrier, and that
wasn't even a small barrier. It was a significant barrier, taking seven
years before IE understood application/xhtml+xml. As for the fact that
"the people that yesterday argued that we all must switch to XHTML,
today argue that HTML5 is the only way to go": speaking generally (or
specifically w.r.t. MathML)? People had to switch to XHTML to get MathML
-- it wasn't even a matter of choice. Cf. again this very insightful
post on the matter:
http://groups.google.com/group/netscape.public.mozilla.mathml/msg/4d58c35217afcb54?dmode=source

So after all these years making the case for something else (XHTML),
what this thread is about is to make <math>...</math> work everywhere,
especially where it still matters the most today, and that is HTML5. As
I indicated, my original take is for <math>...</math> to work as-is --
as we have come to know and enjoy it. But it is obvious that this new
mixing has to be defined somehow, even if we later come to a conclusion
saying that it is an opaque <object>, or a profile of some sort.

But I hope that as further insight is gathered through the
proof-of-concept, it turns out that <math>...</math> is just fine, and
that interoperability issues won't be thrown at an already special niche
technology. While on this, I should stress that tag-soup is possible
anywhere, although this is often not mentioned because the extent is
much different. Well-formed tag-soup (as odd as it sounds...) is
possible, which is why these reddish "invalid-markup" messages sometimes
pop in Gecko's MathML rendering. Such things are left undefined by the
spec. However, in the case of MathML where the markup is generated
automatically by software, there is no particular reason to believe that
these generators will suddenly start to generate an indigestible
tag-soup. So it is not quite realistic to over-emphasize this issue.

MathML already works in XML/XHTML and this proposal is not going to
break that. But there is little else to gain there (as far as MathML is
concerned). Publishers who use XML in their back-end production line can
continue to do what they have been doing.

However, MathML stands to win more (especially individual users) in the
front-end by being in HTML (HTML5 for that matter). This might also
encourage those building HTML authoring tools to consider interfacing
MathML (either with free or commercial plug-ins) because the XML/XHTML
barrier won't be standing right at their face. (On the issue of the
verbosity of MathML, this wouldn't be much of an issue if people didn't
have to stare at the MathML. In fact, when I look at HTML+Javascript+CSS
pages these days, they are also quite cryptic... Is it possible to have
invisible/collapsible MathML in an editor interfaced to a plug-in?
Surely, for people who have experience building comprehensive editors.
But with the XHTML barrier they can't even chime in...)

I am sure by now that it should be evident that it is XML/XHTML that
stand to lose with MathML enabled in HTML5. Anyway, XHTML doesn't seem
to be going anywhere. (How often does one stumble on a page served as
application/xhtml+xml -- if it isn't a page with MathML?) In any case,
as I indicated, it will still work there, maybe not just as _the_
selling argument that it is now. (Many math pages wouldn't have bothered
with XHTML if it had been possible to have MathML in HTML, and that's
where their loss might come from. But does it really matter? Read Robert
Miner's earlier post again.)

To advance MathML, we contributed a great deal to XML/XHTML and pushed
for them so much that it is very easy to forget the initial focus.
MathML-in-HTML5? Worth a try. The thread is now about the issues in
prototyping this, and the benefits (or otherwise) for MathML and math on
the web. And I must say I don't see that many disadvantages in enabling
MathML everywhere at this point.
---
RBS

Paul Topping

Oct 3, 2006, 2:27:42 AM10/3/06
to Roger B. Sidje, White Lynx, www-...@w3.org, dev-tec...@lists.mozilla.org
This all sounds vaguely familiar. When MathML (and Mozilla) were new,
many of us argued for MathML support in Mozilla's HTML parser for many
of the same reasons I see here. We were told by the Mozilla chieftains
that this would only happen over their dead bodies and that XHTML was
the only way we were going to get MathML support. Perhaps it did take us
7 years to get IE to work with XHTML+MathML but IE has also had a
solution for MathML embedded in HTML for even longer.

While Microsoft may have (nasty) business reasons for not supporting
XHTML, they may also have made the argument that the world wasn't ready
to change all their pages into XHTML just for some gain in "purity".
Sounds like some people on this list are coming around to that same
point of view.

So, as I posted a week ago, why not adopt the Microsoft convention for
embedding MathML (or any other XML language) in HTML? Minus the COM
class id stuff, of course. Basically, this would result in a simple
declaration of the embedded language's namespace. For the reasons stated
earlier, just <math> is not enough. At a minimum, it doesn't allow for
smooth transitions to new versions of MathML. Come on, Microsoft isn't
wrong all the time.

Paul Topping
Design Science


White Lynx

Oct 3, 2006, 3:18:43 AM10/3/06
to
> Please don't go down that road.
> Let's not have two incompatible markup languages, both called "MathML,"
> one of which can be embedded in HTML5, the other in XHTML.

Completely agree. Personally I am not against removing mandatory tokens
and following the approach taken by ECMA (this attitude does not
necessarily reflect the position of the Math WG, however), but I am
radically against the current approach. It does not make sense to remove
tokens from markup while preserving them in the DOM (the semantic value
of tokens automatically generated by the parser is zero, and not all
conversion/interchange tools operate through the DOM).

> I don't know how common the knowledge is, but MathML is closely tied
> with a certain platform: Mathematica

They just use MathML for import/export of math formulae. This is not
the kind of integration I meant.

>> I consider switching from XML to text/html as inappropriate and
>> pointless development, moreover it is damaging in long term perspective.

> Damaging to what? To MathML? Not really in my opinion. What damage could
> there be to have plenty of MathML formulas on the web?!?

What prevents you from having plenty of formulae on the web today? Do we
have at least one MathML implementation that supports HTML but lacks
XHTML support? Do we have MathML implementations that support XHTML
only? So how will introducing two different and incompatible parsing
rules improve interoperability? And assume that you have plenty of
formulae on the web and you want to process them. How does having half
of them in tagsoup and the other half in XML make them easier to
handle?

> But to the
> XML/XHTML agenda, possibly. And that has been the real "problem" since
> the beginning, and which I alluded to in my opening post.

It is not the beginning. Seven years have passed since that time and a
lot of XML applications have emerged since then. Most current W3C
specifications are designed with XML in mind, not SGML or HTML. MathML
is part of a large and extensible framework where it can be combined
with other XML applications. The current proposal adds no new
functionality to MathML, but rather artificially splits the MathML
community into incompatible parts that have to be dealt with separately.

> Interested in using MathML? First pass that XHTML barrier, and that
> wasn't even a small barrier. It was a significant barrier, taking seven
> years before IE understood application/xhtml+xml.

It was. But it is not anymore. So it is not clear what you are
struggling with. Maybe someone has to struggle with legacy text/html
content, but that is not our problem: we have no MathML-in-HTML legacy.
Maybe someone complains that MSIE does not support
application/xhtml+xml; again, that is not our problem, as without
MathPlayer MSIE cannot process MathML, while with MathPlayer the
application/xhtml+xml problem is N/A.
If someone has doubts about the future of XML in MSIE, note that
Microsoft's own mathematical markup language is (and most other recent
formats are) entirely XML based.

> MathML already works in XML/XHTML and this proposal is not going to
> break that.

XML for maths means better interoperability (and extensibility); this
proposal splits MathML into two different versions.

> This might also
> encourage those building HTML authoring tools to consider interfacing
> MathML (either with free or commercial plug-ins) because the XML/XHTML
> barrier won't be standing right at their face.

Once again, there is no barrier: XHTML has all the functionality that
HTML has and much more. The only issue is the MSIE parser and, as noted
above several times, this issue is N/A to MathML today.

> Many math pages wouldn't have bothered
> with XHTML if it had been possible to have MathML in HTML

Which means that going in that direction will give rise to two
different versions of MathML, damaging interoperability and introducing
no new functionality.

> MathML-in-HTML5? Worth a try.

Once you try something you can't always untry it. Just proceed with your
proposal and we will have to struggle with text/html legacy forever.

Jacques Distler

Oct 3, 2006, 9:17:47 AM10/3/06
to
In article <1159859923....@m73g2000cwd.googlegroups.com>,
White Lynx <whit...@operamail.com> wrote:

>It does not make sense to remove
>tokens from markup while preserving them in DOM (the semantic value of
>tokens automatically generated by parser is zero, and not all
>conversion/interchange tools operate through DOM).

HTML does this all the time. (E.g. inferred <tbody> element as a child
of <table>, inferred <head> and <body> elements,...) There's nothing
wrong with inferred elements ... per se.

The only problem occurs when people expect their MathML code (or, more
pertinently, the software they use to generate it) to be interoperable
in an XML context.

>> This might also
>> encourage those building HTML authoring tools to consider interfacing
>> MathML (either with free or commercial plug-ins) because the XML/XHTML
>> barrier won't be standing right at their face.
>
>Once again there is no barrier, XHTML has all the functionality that
>HTML has and much more.

I disagree strongly. XHTML is a *huge* barrier. CMS's that reliably
produce XHTML are rare to nonexistent.

And many users don't have control over the MIME-type their pages are
sent with. If they did, you wouldn't have so many RSS feeds sent as
text/XML (which, unless they are plain ASCII, means they are
automatically ill-formed).

>The only issue is MSIE parser and as noted
>above several times this issue is N/A to MathML today.

Precisely. MathPlayer2 allows IE/6 to consume MathML embedded in tag
soup *TODAY*.

There's every incentive to have the Mozilla people experiment with
allowing Mozilla to do the same.

White Lynx

Oct 3, 2006, 10:55:21 AM10/3/06
to
> >It does not make sense to remove
> >tokens from markup while preserving them in DOM (the semantic value of
> >tokens automatically generated by parser is zero, and not all
> >conversion/interchange tools operate through DOM).
>
> HTML does this all the time. (E.g. inferred <tbody> element as a child
> of <table>, inferred <head> and <body> elements,...) There's nothing
> wrong with inferred elements ... per se.

It is one thing when you can unambiguously infer a completely useless
element that has no semantic value and just groups rows (tbody), and
another thing when you infer out of nowhere elements with either
predefined presentation or semantics, like address or i.

> The only problem occurs when people expect their MathML code (or, more
> pertinently, the software they use to generate it) to be interoperable
> in an XML context.
>
> >> This might also
> >> encourage those building HTML authoring tools to consider interfacing
> >> MathML (either with free or commercial plug-ins) because the XML/XHTML
> >> barrier won't be standing right at their face.
>
> >Once again there is no barrier, XHTML has all the functionality that
> >HTML has and much more.
>
> I disagree strongly. XHTML is a *huge* barrier. CMS's that reliably
> produce XHTML are rare to nonexistent.

You tend to turn simple things into rocket science.

> And many users don't have control over the MIME-type their pages are
> sent with. If they did, you wouldn't have so many RSS feeds sent as
> text/XML (which, unless they are plain ASCII, means they are
> automatically ill-formed).

Browsers follow Appendix F.2 of the XML recommendation (if an XML entity
is in a file, the Byte-Order Mark and encoding declaration are used (if
present) to determine the character encoding), not RFC 3023.
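
The two rules being contrasted here can be sketched as follows. This is a simplified illustration and not a full implementation of either document: the function names are made up for the example, only the UTF-8 and UTF-16 byte-order marks are recognised, and the UTF-8 fallback is the XML default when nothing is declared.

```python
import re


def charset_from_document(body: bytes) -> str:
    """Guess the encoding from the document itself (BOM, then XML declaration)."""
    if body.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    match = re.match(rb'<\?xml[^>]*encoding\s*=\s*["\']([\w.-]+)["\']', body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"  # XML default when nothing is declared


def charset_per_rfc3023(content_type: str, body: bytes) -> str:
    """Charset a strict RFC 3023 consumer would assume for an HTTP response."""
    media_type, _, params = content_type.partition(";")
    match = re.search(r'charset\s*=\s*"?([\w.-]+)"?', params, re.IGNORECASE)
    if match:
        return match.group(1).lower()
    if media_type.strip().lower() == "text/xml":
        # text/xml without a charset parameter defaults to US-ASCII,
        # regardless of what the document declares about itself.
        return "us-ascii"
    return charset_from_document(body)


feed = '<?xml version="1.0" encoding="utf-8"?><rss>caf\u00e9</rss>'.encode("utf-8")
print(charset_per_rfc3023("text/xml", feed))  # header-based rule
print(charset_from_document(feed))            # document-based rule
```

Under the header-based rule the feed above is treated as US-ASCII and its non-ASCII byte makes it ill-formed; under the document-based rule it is read as UTF-8.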

>
> >The only issue is MSIE parser and as noted
> >above several times this issue is N/A to MathML today.
>
> Precisely. MathPlayer2 allows IE/6 to consume MathML embedded in tag
> soup *TODAY*.

Well, MSIE does not deal with MathML in any form, and I am not against
embedding MathML in environments other than XML (you can embed it in
LaTeX if you want), but I am against turning it into tagsoup, which is
a different issue.

White Lynx

Oct 3, 2006, 11:01:06 AM10/3/06
to
> There's every incentive to have the Mozilla people experiment with
> allowing Mozilla to do the same.

Nobody is against experiments here, just please use your own unique
name (MathML is in use already), your own namespace (if applicable) and
your own content type (that is, in case the experiment goes beyond the
boundaries of the given content type).
Or alternatively make any changes to the markup language through the
channels provided by the organization that developed this markup,
defined the relevant namespace and registered the content type.

Jacques Distler

Oct 3, 2006, 11:25:03 AM10/3/06
to
In article <1159887321....@c28g2000cwb.googlegroups.com>,
White Lynx <whit...@operamail.com> wrote:

>> HTML does this all the time. (E.g. inferred <tbody> element as a child
>> of <table>, inferred <head> and <body> elements,...) There's nothing
>> wrong with inferred elements ... per se.
>
> One thing when you can unambiguously infer completely useless element
>that has no semantic value

If you can unambiguously infer an element, it matters not a *whit*
whether it is "useful" or "useless."

I could make the same argument about inferred end tags in HTML.
(Inferred elements are just a special case, where both the start and
end tags are optional.)

> > I disagree strongly. XHTML is a *huge* barrier. CMS's that reliably
> > produce XHTML are rare to nonexistent.
>
> You tend to turn simple things into rocket science.

Writing a CMS that reliably produces well-formed XHTML is "simple"?

You should write one then. The world will thank you.

> > Precisely. MathPlayer2 allows IE/6 to consume MathML embedded in tag
> > soup *TODAY*.
>
>Well, MSIE does not deal with MathML in any form and I am not against
>embedding MathML in environments other than XML (you can embed it in
>LaTeX if you want) but I am against turning it into tagsoup which is
>different issue.

MSIE, with the MathPlayer2 plugin, consumes (well-formed) MathML
fragments embedded in tag-soup ("X")HTML. It does that *TODAY*.

If the idea of MathML in tag soup bothers you, sorry, but it's too
late. That ship has sailed.

David Carlisle

Oct 3, 2006, 11:29:58 AM10/3/06
to whit...@operamail.com, www-...@w3.org, dev-tec...@lists.mozilla.org

> Well, MSIE does not deal with MathML in any form

This isn't really the case. It's true that if you are using
IE+MathPlayer then the math rendering is being done by an application
produced by Design Science rather than Microsoft, but would you say that
"Opera doesn't deal with applets in any form" just because executing an
applet requires a JDK from sun (or some other Java virtual machine)?
In practice, what a user experiences as "the browser" might be any
number of applications from multiple companies.

IE, for all its faults, has a rather sensible way of dealing with
extending HTML with XML languages (MathML, SVG, ...). Mozilla, leveraging
off its open source basis, requires the core engine to be extended to
support these languages. IE on the other hand exposes an API that
allows a particular rendering engine to register itself to render
specific XML namespaces. The actual implementation of the idea in IE
unfortunately has some flaws in that it requires explicit COM ids being
declared in an object element in the page, and requires a non standard
namespace declaration syntax. However these flaws can be hidden from the
user as long as some guidelines are followed.

> and I am not against
> embedding MathML in environments other than XML (you can embed it in
> LaTeX if you want)

Yes, I once implemented an XML parser in TeX, with that in mind...
http://www.google.co.uk/search?q=xmltex

> but I am against turning it into tagsoup which is
> different issue.

I agree, and this is one of the merits of the IE approach, that I hope
would be seriously considered for Mozilla. It isn't necessary for
HTML <4+n> to specify "html-variants" of the various XML languages, _any_
_well formed_ XML fragments can be included, so long as you register the
namespace with the application to bind it to a rendering component. In
IE that binding happens in the html page itself, but it would be better
done at the browser level.
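
A browser-level binding of namespaces to rendering components might look roughly like the sketch below. Everything here is hypothetical (the registry, the function names and the toy renderer); it does not reflect IE's actual API, only the idea of dispatching a well-formed XML fragment by its namespace.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical browser-level registry: namespace URI -> rendering component.
RENDERERS = {}


def register_renderer(namespace, renderer):
    """Bind a namespace to the component that knows how to render it."""
    RENDERERS[namespace] = renderer


def render_fragment(fragment):
    """Parse a fragment as XML and hand it to the renderer for its namespace.

    ET.fromstring raises ET.ParseError if the fragment is not well formed.
    """
    root = ET.fromstring(fragment)
    # ElementTree spells a namespaced tag as "{namespace}localname".
    namespace = root.tag[1:].partition("}")[0] if root.tag.startswith("{") else ""
    renderer = RENDERERS.get(namespace)
    if renderer is None:
        return "[no renderer registered for %r]" % namespace
    return renderer(root)


def describe_math(root):
    """Toy stand-in for a real MathML rendering component."""
    return "MathML fragment with %d child element(s)" % len(root)


register_renderer(MATHML_NS, describe_math)
print(render_fragment('<math xmlns="%s"><mi>x</mi></math>' % MATHML_NS))
```

With this design the page only needs to carry the namespace declaration; which component handles it is decided once, in the browser, rather than in every page.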

I think that if a simpler linear input form without so much element
markup overhead is required, (and almost certainly it is required)
then something more like
http://www1.chapman.edu/~jipsen/mathml/asciimath.html
is what is wanted (ie, no element markup at all). Asciimath as published
at the above address does the expansion to MathML on the client (so it
is the tex-like syntax that would be served) but an alternative would be
to do the expansions on the server, which is essentially the wiki
approach, allowing you to write 1+x^2 as shorthand for
<mn>1</mn><mo>+</mo>...
just as
* zzz
is shorthand for <ul><li>zzz... in many wiki variants.
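
As a rough illustration of such a server-side expansion, here is a minimal sketch. It is not ASCIIMath's grammar or API: the function names are invented, and it only understands numbers, single-letter identifiers, operators and a single-token superscript after "^".

```python
import re
from xml.sax.saxutils import escape

# One token is a number, a single letter, or any other non-space character.
TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([A-Za-z])|(\S))")


def token_to_element(token):
    """Wrap one token in the matching MathML token element."""
    number, identifier, other = token
    if number:
        return "<mn>%s</mn>" % number
    if identifier:
        return "<mi>%s</mi>" % identifier
    return "<mo>%s</mo>" % escape(other)


def linear_to_mathml(expression):
    """Expand a tiny linear syntax such as 1+x^2 into presentation MathML."""
    tokens = TOKEN.findall(expression)
    elements = []
    i = 0
    while i < len(tokens):
        if tokens[i][2] == "^" and elements and i + 1 < len(tokens):
            # base^exponent: fold the previous element and the next token.
            base = elements.pop()
            exponent = token_to_element(tokens[i + 1])
            elements.append("<msup>%s%s</msup>" % (base, exponent))
            i += 2
        else:
            elements.append(token_to_element(tokens[i]))
            i += 1
    return ('<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow>'
            + "".join(elements) + "</mrow></math>")


print(linear_to_mathml("1+x^2"))
```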

David


Jacques Distler

Oct 3, 2006, 11:56:26 AM10/3/06
to
In article
<mailman.6561.115988941...@lists.mozilla.org>, David
Carlisle <dav...@nag.co.uk> wrote:


>I think that if a simpler linear input form without so much element
>markup overhead is required, (and almost certainly it is required)
>then something more like
>http://www1.chapman.edu/~jipsen/mathml/asciimath.html
>is what is wanted (ie, no element markup at all). Asciimath as published
>at the above address does the expansion to MathML on the client (so it
>is the tex-like syntax that would be served) but an alternative would be
>to do the expansions on the server,

Whether one uses a Wiki-like syntax, or a tex-like syntax, and whether
the expansion to MathML is done server-side (as, say, in blahtex or
itex2MML), or client-side (as in asciimath), it is simply the case that
*no one* hand-authors MathML in a production environment. Hence, I
agree, that the verbosity of MathML is a non-issue.

>The actual implementation of the idea in IE unfortunately has some flaws
>in that it requires explicit COM ids being declared in an object element
>in the page, and requires a non standard namespace declaration syntax.
>However these flaws can be hidden from the user as long as some guidelines
>are followed.

With MathPlayer2, these flaws are hidden from the author as well.
IE+MathPlayer2 can consume bog-standard XHTML+MathML documents. (With
the proviso that the "XHTML" is actually treated as tag-soup.)

I also happen to like the IE approach, which requires well-formed
MathML. Since the MathML content (like most SVG content) is produced by
automated tools, it is not too much of a burden to demand (or expect)
that it be well-formed.
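
For a tool that generates MathML, checking well-formedness before publishing is cheap. A minimal sketch using only the Python standard library (the function name is an assumption of this example):

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as a single well-formed XML element."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


good = '<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math>'
bad = '<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</math>'
print(is_well_formed(good), is_well_formed(bad))
```

This checks well-formedness only; it says nothing about whether the markup is valid MathML.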

William F Hammond

Oct 3, 2006, 12:30:33 PM10/3/06
to dev-tec...@lists.mozilla.org
Jacques Distler <dis...@golem.ph.utexas.edu> writes:

> In article <1159887321....@c28g2000cwb.googlegroups.com>,
> White Lynx <whit...@operamail.com> wrote:
>
>>> HTML does this all the time. (E.g. inferred <tbody> element as a child
>>> of <table>, inferred <head> and <body> elements,...) There's nothing
>>> wrong with inferred elements ... per se.
>>
>> One thing when you can unambiguously infer completely useless element
>>that has no semantic value
>
> If you can unambiguously infer an element, it matters not a *whit*
> whether it is "useful" or "useless."
>
> I could make the same argument about inferred end tags in HTML.
> (Inferred elements are just a special case, where both the start and
> end tags are optional.)

As I understand it, <tbody> is off point in relation to White Lynx's
concern.

I think he was originally speaking against having entity names like
&dagger; available in a user agent's DOM while they are formally
excluded from an author's content as shipped through the web.

Named or not it's CDATA, and I think it's something of a house of
cards to be making item-by-item decisions on bits of CDATA. It's a
covert resurrection of SDATA. How can one hope for consistency across
various user agents?

>>Well, MSIE does not deal with MathML in any form and I am not against
>>embedding MathML in environments other than XML (you can embed it in
>>LaTeX if you want) but I am against turning it into tagsoup which is
>>different issue.
>
> MSIE, with the MathPlayer2 plugin, consumes (well-formed) MathML
> fragments embedded in tag-soup ("X")HTML. It does that *TODAY*.

I think the idea of an "Appendix C" profile that was mentioned, I
believe, by Jacques Distler is meritorious. I also think it
consistent with what Roger Sidje has said.

I can understand why XML namespace prefixes would be problematical in
HTML 5, but I see no harm allowing suitably profiled content that
would be valid XHTML+MathML when served as "application/xhtml+xml" to
sail also as HTML 5 IF the current discussion makes any sense at all.
In particular, Ian Hixie previously said that xmlns attribute settings
would be a conformance violation. But as I read the current whatwg
spec it would be a violation only of the third kind, and I find that
third clause, along with its table example, not persuasive.

> If the idea of MathML in tag soup bothers you, sorry, but it's too
> late. That ship has sailed.

Oh? Your content? Does Mozilla handle it? Where can we see it?

-- Bill

Juan R.

Oct 3, 2006, 12:50:48 PM10/3/06
to
Chris Chiasson wrote:
> White Lynx wrote:
> >All other approaches are closely
> > integrated in some extensible framework with powerful formatting
> > mechanism (LaTeX/TeX, ISO-12083/SGML+DSSSL, OfficeMath/WordML).
>
> I don't know how common the knowledge is, but MathML is closely tied
> with a certain platform: Mathematica
>
> Wolfram Research (makers of Mathematica) was one of the originators of
> MathML. Anyway, present day MathML is strongly related to Mathematica's
> internal representation of math, as shown in this short example.

It is interesting that the Nov. 1995 Wolfram Research draft for Math on
the web was never approved. The final Apr. 1998 MathML W3C
recommendation, of course, is not completely unrelated to the early
Wolfram draft, but is not the same, somewhat as MathML is not ISO-12083
or TeX even if there exist some similarities.

> Consider Euler's formula as entered in the most source-code like syntax
> available in Mathematica (called InputForm):
>
> E^(I*x)==Cos[x]+I*Sin[x]
>
> After parsing, it becomes this (called FullForm):
>
> Equal[Power[E,Times[Complex[0,1],x]],Plus[Cos[x],Times[Complex[0,1],Sin[x]]]]
>
> Compare this with content MathML
>
> <math
> xmlns='http://www.w3.org/1998/Math/MathML'><apply><eq/><apply><power/><
> exponentiale/><apply><times/><imaginaryi/><ci>x</ci></apply></apply><apply><
> plus/><apply><cos/><ci>x</ci></apply><apply><times/><imaginaryi/><apply><sin/>
> <ci>x</ci></apply></apply></apply></apply></math>
>
> Notice how <apply> is used to capture the structure of
> head[arg1,arg2,arg3] as <apply><head/><arg1/><arg2/><arg3/></apply>.

The first you write is an M-expression. The second is an XML encoding of
an S-expression. They are two different concepts even if you can
transform between both. Moreover, take the LISP/Scheme representation

(head arg1 arg2 arg3).

where the ( ) indicates an application to be evaluated. Which is
closer to c-MathML: Lisp or Mathematica?
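
The correspondence between (head arg1 arg2 arg3) and <apply> can be made concrete with a small sketch. Nested Python tuples stand in for S-expressions here; the function name and the short list of constant symbols are assumptions made for the example, and the element names are the content MathML ones from the quoted markup.

```python
import xml.etree.ElementTree as ET

# Constant symbols written as empty elements (a deliberately short list).
CONSTANTS = {"exponentiale", "imaginaryi", "pi"}


def to_content_mathml(expr):
    """Map (head, arg1, arg2, ...) to <apply><head/>arg1 arg2 ...</apply>."""
    if isinstance(expr, tuple):
        head, *args = expr
        node = ET.Element("apply")
        ET.SubElement(node, head)
        for arg in args:
            node.append(to_content_mathml(arg))
        return node
    if isinstance(expr, (int, float)):
        node = ET.Element("cn")
        node.text = str(expr)
        return node
    if expr in CONSTANTS:
        return ET.Element(expr)
    node = ET.Element("ci")
    node.text = expr
    return node


# Euler's formula, written the way a Lisp programmer would nest it.
EULER = ("eq",
         ("power", "exponentiale", ("times", "imaginaryi", "x")),
         ("plus", ("cos", "x"), ("times", "imaginaryi", ("sin", "x"))))

print(ET.tostring(to_content_mathml(EULER), encoding="unicode"))
```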

What is more, as far as I know Mathematica uses M-expressions just at
the syntax level, not as its internal representation.

> So those plentiful <mrow> elements shouldn't be unexpected. Also, it
> becomes pretty apparent why presentation MathML is nearly
> incomprehensible. It is a representation of an already verbose two
> dimensional box formatting system in XML, making it even more verbose.

Are you claiming that the best way to improve comprehensibility while
decreasing the verbosity of "those plentiful <mrow> elements" may be
promoting a new syntax (Ian's syntax) that maintains all the mrows
there?

XML was really designed for documents, not raw data. MathML is poorer
still because data is defined at the token level. One did not need to
be a genius to compute the order of magnitude of the file-size overhead
of such an approach. The consequences should have been computed _before_
implementing MathML in a native way. The Microsoft (embedded islands
plus plugin) and Opera (CSS + JS) moves were much more intelligent.

> IMHE (in my humble estimation) Firefox people would be better off
> trying to define "shorthand" definitions for the content MathML system,
> which WRI will be less likely to oppose.

Does this make sense if FF (Mozilla) does not support content MathML in
either a native or a plugin way?

Juan R.

Oct 3, 2006, 1:08:36 PM10/3/06
to
Jacques Distler wrote:
> In article <1159887321....@c28g2000cwb.googlegroups.com>,
> White Lynx <whit...@operamail.com> wrote:
>
> > > I disagree strongly. XHTML is a *huge* barrier. CMS's that reliably
> > > produce XHTML are rare to nonexistent.
> >
> > You tend to turn simple things into rocket science.
>
> Writing a CMS that reliably produces well-formed XHTML is "simple"?
>
> You should write one then. The world will thank you.
>

I agree that the production of XHTML (even strict) is not rocket
science.

With MSIE not supporting XHTML and the Mozilla implementation really
sucking (even Mozilla guys recommend the use of HTML before using XHTML
when you are not benefiting from other XML applications: MathML, SVG,
etc.) there is no commercial interest in first-class XHTML tools, and
most developers simply adapted their previous HTML presentational
algorithms to the X hype.

What is the benefit of writing nice XHTML tools for science if
people like you then introduce all kinds of crazy code (incorrect
rendering, extra mrows collapsing the Mozilla engine, numbers split at
the decimal point, ds^2 being encoded as 2s ds...) when using your
inefficient IteX plugin on the Internet?

However, there exist a couple of XHTML tools generating good code (even
a few already can generate pure strict code) and tools generating very
good MathML code and next year we could -maybe- see Word generating
XHTML for blogs (it appears that W3C-validated strict code is their
target).

Juan R.

Oct 3, 2006, 1:18:11 PM10/3/06
to

Jacques Distler wrote:
>
> Whether one uses a Wiki-like syntax, or a tex-like syntax, and whether
> the expansion to MathML is done server-side (as, say, in blahtex or
> itex2MML), or client-side (as in asciimath), it is simply the case that
> *no one* hand-authors MathML in a production environment. Hence, I
> agree, that the verbosity of MathML is a non-issue.

One should not confuse asciimath with asciimathJS. Contrary to
itex2MML, asciimath works on both the client and the server side. Look
at the PHP version

[http://www.jcphysics.com/ASCIIMath/]

Other versions could be developed at the server side.

Since this stuff has already been available for years, there is no need
for Ian's mixed syntax, which is still an order of magnitude more
verbose than asciimath, itex, latex... and therefore will remain
unpopular.

White Lynx

Oct 3, 2006, 1:28:36 PM10/3/06
to
> I think he was originally speaking against having entity names like
> &dagger; available in a user agent's DOM while they are formally
> excluded from an author's content as shipped through the web.

No I meant mi, mo, mn token elements.

> >> HTML does this all the time. (E.g. inferred <tbody> element as a child
> >> of <table>, inferred <head> and <body> elements,...) There's nothing
> >> wrong with inferred elements ... per se.
>
> > One thing when you can unambiguously infer completely useless element
> >that has no semantic value
>
> If you can unambiguously infer an element, it matters not a *whit*
> whether it is "useful" or "useless."

If. And if so they would not be introduced at all.

>
> I could make the same argument about inferred end tags in HTML.
> (Inferred elements are just a special case, where both the start and
> end tags are optional.)
>
> > > I disagree strongly. XHTML is a *huge* barrier. CMS's that reliably
> > > produce XHTML are rare to nonexistent.
>
> > You tend to turn simple things into rocket science.
>
> Writing a CMS that reliably produces well-formed XHTML is "simple"?
>
> You should write one then. The world will thank you.

I don't need it (neither CMS nor thanks).

> If the idea of MathML in tag soup bothers you, sorry, but it's too
> late. That ship has sailed.

It is not my fault.

Jacques Distler

Oct 3, 2006, 2:48:35 PM10/3/06
to
In article <1159896516.3...@i3g2000cwc.googlegroups.com>,
White Lynx <whit...@operamail.com> wrote:

> No I meant mi, mo, mn token elements.
>
>> HTML does this all the time. (E.g. inferred <tbody> element as a child
>> of <table>, inferred <head> and <body> elements,...) There's nothing
>> wrong with inferred elements ... per se.
>>
>>> One thing when you can unambiguously infer completely useless element
>>>that has no semantic value
>>
>> If you can unambiguously infer an element, it matters not a *whit*
>> whether it is "useful" or "useless."
>
> If. And if so they would not be introduced at all.

If one CAN'T unambiguously infer <mo>, <mi> and <mn> elements, then
there's no point in arguing whether it's a good idea to make them
optional. They would *necessarily* (by the design criteria of the
proposal) be required elements.

The only case to argue is *if* (I haven't thought about it, so I'll
assume Ian is correct) they *can* be unambiguously inferred, whether it
is a good idea to do so.

I have argued that it is *not*, but for reasons that have nothing to do
with the alleged semantic value (or lack thereof) of these elements.

In any case, if one followed my proposal of creating a
tag-soup-parser-safe profile of MathML, then there would be no
discussion here. These are required elements in MathML (they are not
inferred); ergo, they would be required elements in any profile of
MathML.

William F Hammond

Oct 3, 2006, 4:25:44 PM10/3/06
to dev-tec...@lists.mozilla.org
Jacques Distler <dis...@golem.ph.utexas.edu> writes:

> If one CAN'T unambiguously infer <mo>, <mi> and <mn> elements, then
> there's no point in arguing whether it's a good idea to make them
> optional. They would *necessarily* (by the design criteria of the
> proposal) be required elements.
>
> The only case to argue is *if* (I haven't thought about it, so I'll
> assume Ian is correct) they *can* be unambiguously inferred, whether it
> is a good idea to do so.
>
> I have argued that it is *not*, but for reasons that have nothing to do
> with the alleged semantic value (or lack thereof) of these elements.
>
> In any case, if one followed my proposal of creating a
> tag-soup-parser-safe profile of MathML, then there would be no
> discussion here. These are required elements in MathML (they are not
> inferred); ergo, they would be required elements in any profile of
> MathML.

Of course, you are right that inference is not generally possible.

When <math> in HTML5 is to mean, as suggested by Roger Sidje, that the
content is MathML, then its content should be correct MathML -- which
is why the <math> opentag should then bear an xmlns attribute even
though there would be no general XML namespace understanding of
"xmlns" across elements appearing in HTML5.

If HTML5 is going to be reasonable, user agents should have provision
for the special case where a whole HTML5 instance between <html>
and </html> is a valid XHTML or XHTML+MathML instance subject to
Appendix C type profiling rules (which would ban things like xml
namespace prefixes and ask for things like <mspace /> rather than
<mspace/>). But <mi>, <mn>, and <mo> should be mandatory.

However, I think in this discussion and the discussion at whatwg I've
seen the germ of a further idea for a more casual kind of math that
would be reasonable for human authoring. In that situation it would
be reasonable for TeX default handling of symbols to apply, e.g.,
<mi>Hom</mi>(X, Y), <mi>cos</mi> ax <mi>sin</mi> bx -- the point being
that by default every loose character represents a symbol, and strings
of length > 1 are symbols only when enclosed in <mi> or something like
it. I have no idea, however, as to whether the advocates of HTML5 are
prepared to render this more casual kind of markup.
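
One possible reading of that default, sketched under assumptions: explicit <mi>...</mi> runs are kept as written, every loose letter becomes its own <mi>, digit runs become <mn>, and any other non-space character becomes <mo>. The function name and the treatment of digits and punctuation are guesses made for this example, not part of any proposal.

```python
import re
from xml.sax.saxutils import escape

# Explicit <mi>...</mi>, a run of digits, one letter, or any other character.
PIECE = re.compile(r"(<mi>.*?</mi>)|(\d+)|([A-Za-z])|(\S)")


def expand_casual(text):
    """Expand casual math markup into explicit MathML token elements."""
    out = []
    for explicit, number, letter, other in PIECE.findall(text):
        if explicit:
            out.append(explicit)  # already marked up by the author
        elif number:
            out.append("<mn>%s</mn>" % number)
        elif letter:
            out.append("<mi>%s</mi>" % letter)  # each loose letter is a symbol
        else:
            out.append("<mo>%s</mo>" % escape(other))
    return "".join(out)


print(expand_casual("<mi>Hom</mi>(X, Y)"))
print(expand_casual("<mi>cos</mi> ax <mi>sin</mi> bx"))
```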

-- Bill

Ian Hickson

Oct 3, 2006, 5:56:57 PM10/3/06
to David Carlisle, www-...@w3.org, dev-tec...@lists.mozilla.org, whit...@operamail.com
On Tue, 3 Oct 2006, David Carlisle wrote:
>
> I agree, and this is one of the merits of the IE approach, that I hope
> would be seriously considered for mozilla. It isn't necessary for HTML
> <4+n> to specify "html-variants" of the various XML languages, _any_
> _well formed_ XML fragments can be included, so long as you register the
> namespace with the application to bind it to a rendering component.

What are the rules for handling non-well-formed content? (Could you show
me an example of this? Different people seem to mean different things
when they talk about IE's extension models.)

--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'

Ian Hickson

Oct 3, 2006, 6:19:46 PM10/3/06
to Bruce Miller, www-...@w3.org, dev-tec...@lists.mozilla.org
On Tue, 3 Oct 2006, Bruce Miller wrote:
>
> In reference to "CMS" not reliably supporting XHTML, (depending on exactly
> what is meant by "CMS"), most web tools[**] seem unable to reliably
> generate (conformant) HTML either -- in the current web, there's little
> motivation. Again, it's hard to see that HTML5 will improve that.

HTML5 will not improve people's authoring skills. However, it will
(hopefully, at least!) improve the interoperability of UAs when handling
broken pages -- with HTML5 we no longer have "tag soup", because every
stream of input characters maps to a single well-defined DOM. There's no
more guesswork involved.

Roger B. Sidje

Oct 3, 2006, 6:27:03 PM10/3/06