MathML-in-HTML5

r...@maths.uq.edu.au

Sep 23, 2006, 9:57:55 AM
to dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
I am currently driving an effort to enable MathML-in-HTML (in addition to
the MathML-in-XHTML that we already support). I have a patch that serves
the dual purpose of showing where things are going and raising the issues
to ponder.

Here is a
[screenshot] https://bugzilla.mozilla.org/attachment.cgi?id=239771
which is a _live_ rendering of this testcase:
[mathml-in-html] https://bugzilla.mozilla.org/attachment.cgi?id=239769

Those interested in following this up can see bug 353926:
https://bugzilla.mozilla.org/show_bug.cgi?id=353926

Quick background:
=================

At the Firefox engineering meeting in Mountain View (December 2005),
I pleaded that we enable MathML in HTML5 to advance the cause
of MathML, which is so far locked in an XHTML/XML world that does not
seem to be going anywhere in terms of display content as opposed to
data (witness the WHATWG effort -- http://www.whatwg.org). Those to
whom I spoke included dbaron, hixie and sicking, and they welcomed the
suggestion, asking for a broader discussion. Hixie raised the caveat
that MathML elements should still remain in the MathML namespace. He
e-mailed me a while ago about a discussion on this matter in the
WHATWG mailing list, which can be seen here
http://listserver.dreamhost.com/pipermail/whatwg-whatwg.org/2006-June/thread.html.

That discussion is however too broad and involves tangential issues such as
inventing another syntax, etc. My original take was simply to enable
MathML+HTML, in the same vein as we have MathML+XHTML. I think MathML
is suffering from having to fight the battle for adoption of XHTML as
well. As a niche technology, it does not have the means to wage that
fight. What it simply needs is MathML-in-HTML. W3C failed to
recognise that it could retrofit MathML in HTML -- see this archived
post for some insight:
http://groups.google.com/group/netscape.public.mozilla.mathml/msg/4d58c35217afcb54?dmode=source
But HTML5, being shepherded by WHATWG, could provide the right framework
for this to happen now.

I have finally been able to code this up (while keeping MathML
elements in the MathML namespace). I attached the patch I had so far
in bug 353926.

Design & Technical issues:
==========================

How does MathML-in-HTML5 work?

We support MathML-in-HTML5 when these two conditions are met:

1. The DOCTYPE of the document says so. If yes, we enable
MathML entities (TODO) and flag mMayHaveMathML in the HTML content sink.

2. And either a) OR b) is met:

a) <html> has the MathML namespace as the value of an attribute with a
prefix, e.g., <html xmlns:m="http://www.w3.org/1998/Math/MathML">.

In this case, we cache the prefix "m" in mMathMLNameSpacePrefix,
and we intercept all <m:tag> in the document and create
MathML content nodes for them.

b) MathML fragments are in the document as
<math xmlns="http://www.w3.org/1998/Math/MathML">
...
</math>

In this case, we intercept all non-HTML elements inside the <math> tag
and create MathML content nodes for them.
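
To make the rules above concrete, here is a minimal Python sketch of the
decision they describe. It is not the code in the patch: SinkState,
is_mathml_element, and KNOWN_HTML_TAGS are hypothetical names, the fields
merely mirror mMayHaveMathML and mMathMLNameSpacePrefix, and the HTML tag
set is a toy subset.

```python
from dataclasses import dataclass
from typing import Optional

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
# Toy subset of HTML tag names, for illustration only.
KNOWN_HTML_TAGS = frozenset({"b", "i", "p", "span", "div"})


@dataclass
class SinkState:
    may_have_mathml: bool = False        # mirrors mMayHaveMathML (set from the DOCTYPE)
    mathml_prefix: Optional[str] = None  # mirrors mMathMLNameSpacePrefix, e.g. "m"
    inside_math: bool = False            # currently inside <math xmlns="...MathML">


def is_mathml_element(tag, state, attrs=None):
    """Return True if the sink should create a MathML content node for `tag`."""
    attrs = attrs or {}
    if not state.may_have_mathml:        # condition 1 is not met
        return False
    prefix, colon, _local = tag.partition(":")
    if colon:                            # condition 2a: prefixed tag such as <m:mfrac>
        return prefix == state.mathml_prefix
    if tag == "math":                    # condition 2b: <math xmlns="...MathML">
        return attrs.get("xmlns") == MATHML_NS
    # Inside a <math> fragment, everything that is not a known HTML tag is MathML.
    return state.inside_math and tag not in KNOWN_HTML_TAGS
```

Under these rules a <b> inside <math> is left to the HTML sink, which is
the HTML-in-MathML wrinkle listed in the issues.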

Issues:
1. Tag soup: we understand that we are exposing ourselves to this.

2. a) What about CSS matching rules? From the Style System point of view,
the document is still HTML, but <m:math> is in the MathML namespace. We
might have to special case MathML-in-HTML5 in the Style System as well.

b) The second option raises an issue with HTML-in-MathML, e.g.,
<math xmlns="http://www.w3.org/1998/Math/MathML">
<b>bold</b>
</math>
We don't intercept the <b> in this case. Hence, even though it is
HTML-in-MathML without an explicit XHTML namespace for <b>, the HTML
sink will give <b> an HTML content node. This is not really XHTML friendly.
On the other hand, we don't want to be an XML parser either... These
are conflicting objectives. We need to decide what to do. We may agree
to only support tags with prefixes as in a), or also keep b) knowing
that it has this XHTML unfriendly behavior.
---
RBS


Ian Hickson

Sep 23, 2006, 5:06:07 PM
to r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Sat, 23 Sep 2006 r...@maths.uq.edu.au wrote:
>
> Hixie raised the caveat that MathML elements should still remain in the
> MathML namespace.

I meant in the DOM, I didn't mean in the markup. I don't think we should
have any namespace declarations or namespace prefixes in text/html; I
would just have the HTML parser always support the MathML elements, in
the same way that it supports any random unknown element today, except
that when it sees a MathML element it puts it into the MathML namespace in
the DOM rather than the XHTML namespace.
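
A minimal sketch of that idea in Python, assuming the parser carries a
fixed table of MathML element names. The table below is only a subset of
the names discussed in this thread, and namespace_for is a hypothetical
helper, not a parser API.

```python
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Illustrative subset of MathML element names known to the parser.
MATHML_ELEMENTS = frozenset(
    {"math", "mrow", "mfrac", "msqrt", "mroot", "msub", "msup"}
)


def namespace_for(tag_name):
    """Choose the DOM namespace from the tag name alone.

    No xmlns attribute or prefix is consulted; unknown elements fall back
    to the XHTML namespace like any other unrecognised tag.
    """
    return MATHML_NS if tag_name.lower() in MATHML_ELEMENTS else XHTML_NS
```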

I really don't think we want to introduce namespace prefixes or namespace
declarations into tag soup. I think that would be a big mistake.

--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'

Paul Topping

Sep 23, 2006, 6:38:52 PM
to Ian Hickson, r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
If MathML is considered a subset of HTML5, then no namespace declaration
would be necessary. However, if MathML is going to work in HTML that
isn't declared as HTML5 (not clear to me from this thread), then the
document would be poorly specified without it, IMHO.

At the risk of inciting an anti-Microsoft backlash, I should remind some
on the list that IE has covered this territory before. They already have
a mechanism for declaring XML islands in HTML that seems to work just
fine. Of course, Mozilla won't be interested in duplicating IE's way of
associating a plugin as the renderer of the namespace in the document.
IMHO, it doesn't belong there anyway. It is better (i.e., more secure) to
keep such associations out of the content.

Paul Topping
Design Science, Inc.
www.dessci.com/mathplayer


Ian Hickson

Sep 23, 2006, 8:08:52 PM
to Paul Topping, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org, r...@maths.uq.edu.au
On Sat, 23 Sep 2006, Paul Topping wrote:
>
> If MathML is considered a subset of HTML5, then no namespace declaration
> would be necessary. However, if MathML is going to work in HTML that
> isn't declared as HTML5 (not clear to me from this thread), then the
> document would be poorly specified without it, IMHO.

As far as HTML5 UAs are concerned, declaring HTML as HTML5 consists of
labelling it as text/html. It isn't clear to me what you would consider
HTML that isn't declared as HTML5. With the exception of quirks which are
required for compatibility with de facto standards that disagree with de
jure standards, HTML has no practical versioning story -- all features
work in all documents, regardless of the official "version" of HTML used.


> At the risk of inciting an anti-Microsoft backlash, I should remind some
> on the list that IE has covered this territory before. They already have
> a mechanism for declaring XML islands in HTML that seems to work just
> fine.

XML data islands don't form part of the parent DOM (they are "islands", as
opposed to part of the document). I'm not sure how wrapping <xml> tags
around the MathML content would help. :-)


> And, I should have added that without a namespace declaration there
> would be no way to differentiate different versions of MathML. While
> most MathML instances are now MathML 2.0, the MathML 3.0 effort is just
> now starting up.

Why would you need to distinguish them? MathML2 is a superset of MathML1,
and (for all intents and purposes) any compliant MathML2 UA can process
any compliant MathML1 content. I would assume that this would continue to
be the case; if not, then this is IMHO a problem with MathML3.

Note that the namespace declaration can't currently distinguish between
MathML1 and MathML2, I don't see any reason why MathML3 would change this.

Chris Chiasson

Sep 24, 2006, 4:58:59 AM
I don't understand. Aren't people who are savvy enough to generate
MathML also savvy enough to generate XHTML? Has anyone actually said,
"That MathML I can handle, but what's this XHTML?"

sha...@shantirao.com

Sep 24, 2006, 12:16:13 PM
On 9/24/2006 1:58 AM, Chris Chiasson wrote:
> I don't understand. Aren't people who are savvy enough to generate
> MathML also savvy enough to generate XHTML? Has anyone actually said,
> "That MathML I can handle, but what's this XHTML?"

Savvy, yes. But also impatient. You will notice that HTML is what gets
used -- not XHTML.

I like to write straight HTML with embedded LaTeX, then run it through a
translator to turn $exponents^2$ into MathML. Sure, HTML->XHTML
converters exist, but again, I'm lazy, selfish, and impatient.

Shanti

Chris Chiasson

Sep 24, 2006, 1:43:39 PM
You find it easier to transform HTML + LaTeX instead of XHTML + LaTeX?
What kind of tools are you using?

David Carlisle

Sep 25, 2006, 5:11:17 AM
to i...@hixie.ch, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org

Ian

> XML data islands don't form part of the parent DOM (they are "islands", as
> opposed to part of the document). I'm not sure how wrapping <xml> tags
> around the MathML content would help. :-)

The syntax Paul was referring to here wasn't the <xml> convention, but
the ability in IE to have (explicitly prefixed) XML elements within an
HTML document with rendering controlled by an external component,
but _without_ any other flag at that point in the markup, such as
<xml> or <object> etc.

In the IE implementation you need to have an <object> in the head
pointing at the particular rendering component, which is fairly horrible,
and you also need to declare the namespace using (a variant of) an
early working draft namespace syntax using a PI, but as Paul said, those
parts needn't be copied. An example of a document using this syntax is
shown here:

http://www.dessci.com/en/products/mathplayer/author/creatingpages.htm#AnatomyMathPlayerWebPage

By using a different classid you can do the same thing to include
(explicitly prefixed) svg into an html document and have it rendered by
Adobe's svg viewer, and in principle any other vocabularies (although I
don't personally know of any other implementations of this, except
techexplorer, which is again for MathML).

I'm not sure. Having math more or less added directly to html would be
nice in many ways, but I'm not sure how well it scales: if you think
people might want to have html+svg+chemml+... then perhaps having an api
that allows processing to be attached to namespaced elements would be
more general. On the other hand that was part of the reason for having
namespaces (and for that matter, xml itself) that people could serve all
sorts of different xml vocabularies and have clients do whatever is
necessary. I suspect part of the reason for "html5" is a feeling that
that never happened and isn't going to be mainstream any time soon, and
that a solution that directly addresses the fixed html vocabulary, with
perhaps two specific extensions such as svg and mathml will in practice
cover the vast majority of browser needs, and other vocabularies can be
transformed to html+.. before being served.

David

Ian Hickson

Sep 25, 2006, 1:38:59 PM
to David Carlisle, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Mon, 25 Sep 2006, David Carlisle wrote:
>
> The syntax Paul was referring to here wasn't the <xml> convention, but
> the ability in IE to have (explicitly prefixed) XML elements within an
> HTML document with rendering controlled by an external component, but
> _without_ any other flag at that point in the markup, such as
> <xml> or <object> etc.

Oh, well, as noted earlier, the idea of namespace prefixes in HTML isn't
one that I personally am particularly fond of.


> I suspect part of the reason for "html5" is a feeling that that never
> happened and isn't going to be mainstream any time soon, and that a
> solution that directly addresses the fixed html vocabulary, with perhaps
> two specific extensions such as svg and mathml will in practice cover
> the vast majority of browser needs, and other vocabularies can be
> transformed to html+.. before being served.

I think that's pretty much exactly correct, yes.

sha...@shantirao.com

Sep 25, 2006, 10:44:15 PM
On 9/24/2006 10:43 AM, Chris Chiasson wrote:
> You find it easier to transform HTML + LaTeX instead of XHTML + LaTeX?
> What kind of tools are you using?

A text editor, and itexMML, of course! Sure, more sophisticated tools
exist, but they aren't very reliable, are they?

Chris Chiasson

Sep 26, 2006, 8:00:53 AM
How would transforming XHTML+LaTeX be harder than HTML+LaTeX with
itexMML?

William F Hammond

Sep 26, 2006, 5:35:39 PM
to dev-tec...@lists.mozilla.org
sha...@shantirao.com writes:

> On 9/24/2006 10:43 AM, Chris Chiasson wrote:
>> You find it easier to transform HTML + LaTeX instead of XHTML + LaTeX?
>> What kind of tools are you using?
>
> A text editor, and itexMML, of course! Sure, more sophisticated tools
> exist, but they aren't very reliable, are they?

Oh?

I expect that the author of whatever more sophisticated tool you try
would like to hear of any lack of reliability you find.

Cheers.

-- Bill

Roger B. Sidje

Sep 26, 2006, 7:59:50 PM
to Ian Hickson, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
> What I would be proposing for HTML5 is just the following list of
> elements:
>
> math, mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom,
> mfenced, menclose, msub, msup, msubsup, munder, mover, munderover,
> mmultiscripts, mtable, mlabeledtr, mtr, mtd, maction

I don't like mlabeledtr very much (I have already expressed my views
about it to folks of the MathML WG), and would hope that they will take
my suggestion for <mtr label="..."> in MathML3. The former is
unnecessarily bloated and doesn't degrade gracefully at all with
renderers that don't support it (not to mention that it is hard to fit
in Gecko's existing table code).

However, your list misses some key tags, in particular leaf tags such as
<mspace/> -- which is sometimes quite useful. Also, <mprescripts/> and
<none/> are needed in <mmultiscripts> (albeit it can be argued that
<none/> is the same as <mrow></mrow> or an empty <mspace/>, but the
differentiation is worthwhile).

In general, I would prefer the list to at least include all the tags
that we already support, and which existing webpages have come to depend
on. This effectively boils down to your list above, excluding
<mlabeledtr>, and including <mspace/>, <mprescripts/>, <none/> and
<mi>, <mn>, <ms>, <mtext>, <mo>. In particular, <mo> is a vital tag as
it is at the heart of those stretchy MathML characters.
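
Spelled out as set arithmetic, the adjustment looks like this (a Python
sketch; the variable names are only for illustration):

```python
# The 23 element names in the quoted proposal.
proposed = {
    "math", "mrow", "mfrac", "msqrt", "mroot", "mstyle", "merror",
    "mpadded", "mphantom", "mfenced", "menclose", "msub", "msup",
    "msubsup", "munder", "mover", "munderover", "mmultiscripts",
    "mtable", "mlabeledtr", "mtr", "mtd", "maction",
}

# Drop <mlabeledtr>, then add the empty elements and the token elements.
revised = (proposed - {"mlabeledtr"}) | {
    "mspace", "mprescripts", "none",   # empty (leaf) elements
    "mi", "mn", "ms", "mtext", "mo",   # token elements
}

print(sorted(revised))
```

That takes the list from 23 names to 30.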

Implementation-wise, as this inclusion of MathML-in-HTML5 marks the
beginning of tag soup, it may be that the HTML parser would have to have
some knowledge of leaf tags, so that for example, a stray <mspace>
doesn't become the root of an entire HTML tree... which is later fed to
the hapless MathML engine. (The patch I attached in bug 353926 ignored
the issue.)
---
RBS

On 26/09/2006 3:59 AM, Ian Hickson wrote:
> On Sun, 24 Sep 2006, Boris Zbarsky wrote:
>
>>Ian Hickson wrote:
>>
>>>We didn't check that <canvas> wouldn't cause clashes, either.
>>
>>I see. I had assumed that we in fact had.
>>
>>
>>>I don't see why. We don't want a flag for when people can use the storage
>>>APIs. Or when they can use <img> elements. Or whatever.
>>
>>True, because those are very unlikely to collide with random stuff the pages
>>are doing (e.g. the storage APIs are using fairly long names that are unlikely
>>to collide with page-defined functions and variables).
>>
>>If we think MathML has a similarly low risk of collision, great.
>
>
> I don't know about "we".
>
> What I would be proposing for HTML5 is just the following list of
> elements:
>
> math, mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom,
> mfenced, menclose, msub, msup, msubsup, munder, mover, munderover,
> mmultiscripts, mtable, mlabeledtr, mtr, mtd, maction
>
> ...and of those only <math> came up in the top 1000 elements in my
> search of elements on about one billion pages.
>
> According to that same research, <math> is, on the Web, less frequent than
> the following elements: <m>, <e>, <rem>, <tab>, <yr>, <prohibits>, <your>,
> <lable>, <text-spez>, etc. It was present on less than 0.002% of the pages
> the research covered. (To give an idea of scale, <h8> is used on more than
> 0.003%, so if we avoid <math> because of this, we should probably
> introduce <h7> and <h8> into HTML, since we're saying that's an important
> enough level to worry about.)
>
> Now, of course, it could be that those 0.002% of pages are all hugely
> important and that we'll break the Web in adding this feature. We can't
> know until we've tried.
>

Ian Hickson

Sep 26, 2006, 8:16:07 PM
to Roger B. Sidje, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Wed, 27 Sep 2006, Roger B. Sidje wrote:
>
> I don't like mlabeledtr very much (I have already expressed my views
> about it to folks of the MathML WG), and would hope that they will take
> my suggestion for <mtr label="..."> in MathML3. The former is
> unnecessarily bloated and doesn't degrade gracefully at all with
> renderers that don't support it (not to mention that it is hard to fit
> in Gecko's existing table code).

I'm happy to drop/add any tag to this list. Just give me the list you
want.


> However, your list misses some key tags, in particular leaf tags such as
> <mspace/> -- which is sometimes quite useful. Also, <mprescripts/> and
> <none/> are needed in <mmultiscripts> (albeit it can be argued that
> <none/> is the same as <mrow></mrow> or an empty <mspace/>, but the
> differentiation is worthwhile).

I missed anything that wasn't in the table I happened upon in the spec. I
didn't look very closely for the exact table I wanted.

Tell me what tags you want to have and we'll make that the list. You're
the expert. :-)


> Implementation-wise, as this inclusion of MathML-in-HTML5 marks the
> beginning of tag soup, it may be that the HTML parser would have to have
> some knowledge of leaf tags, so that for example, a stray <mspace>
> doesn't become the root of an entire HTML tree... which is later fed to
> the hapless MathML engine. (The patch I attached in bug 353926 ignored
> the issue.)

Don't worry, these tags auto-close when a parent tag is closed.

<foo><bar><baz></foo><quux>

...results in this DOM:

<foo>
  <bar>
    <baz>
<quux>

For leaf nodes with following siblings, people will have to use end tags,
as in:

<foo><bar></bar><baz></baz></foo><quux></quux>

If we want to start adding actual leaf tags, I'd rather do this in a
second stage, after we have a proof of concept. (I've so far avoided
adding any new tags to the HTML5 parser spec, but eventually there will be
a bunch we have to add.)

We can go from non-empty to empty much more easily than from empty to
non-empty.
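
The auto-closing behaviour described above can be sketched with a small
stack-based tree builder. This is a toy in Python, not the HTML5 parsing
algorithm: build_tree and outline are hypothetical helpers, and
attributes, text, and empty elements are deliberately ignored.

```python
import re


def build_tree(markup):
    """Return nested [name, children] lists for a string of bare tags."""
    root = ["#root", []]
    stack = [root]
    for closing, name in re.findall(r"<(/?)([A-Za-z][A-Za-z0-9]*)>", markup):
        if not closing:
            node = [name, []]
            stack[-1][1].append(node)
            stack.append(node)
        elif any(open_node[0] == name for open_node in stack[1:]):
            # An end tag closes its element and everything still open inside it.
            while stack.pop()[0] != name:
                pass
        # An end tag with no matching open element is simply ignored.
    return root[1]


def outline(nodes, depth=0):
    """Render the tree as indented lines, one element per line."""
    lines = []
    for name, children in nodes:
        lines.append("  " * depth + "<" + name + ">")
        lines.extend(outline(children, depth + 1))
    return lines


print("\n".join(outline(build_tree("<foo><bar><baz></foo><quux>"))))
print("\n".join(outline(build_tree("<mrow><mspace><mi></mi></mrow>"))))
```

The first call reproduces the nesting shown above. The second shows the
cost of treating every tag as a container: without an end tag, a stray
<mspace> ends up as the parent of the <mi> that follows it.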

Roger B. Sidje

Sep 26, 2006, 9:03:25 PM
to Ian Hickson, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On 27/09/2006 10:16 AM, Ian Hickson wrote:

> I'm happy to drop/add any tag to this list. Just give me the list you
> want.

OK.

> For leaf nodes with following siblings, people will have to use end tags,
> as in:
>
> <foo><bar></bar><baz></baz></foo><quux></quux>
>
> If we want to start adding actual leaf tags, I'd rather do this in a
> second stage, after we have a proof of concept. (I've so far avoided
> adding any new tags to the HTML5 parser spec, but eventually there will be
> a bunch we have to add.)

OK, I see.

The other issue is those 2000 entities that MathML has. You said that
you are not a big fan of a namespace thingy on the root <html> element.

Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all
W3C entities _by default_? We have a proof-of-concept of that in View
Selection Source, BTW. It will display any entity it can.
http://lxr.mozilla.org/mozilla/source/content/base/public/nsIDocumentEncoder.idl#125
As VSS has stood the test of time without major complaints, perhaps
<!DOCTYPE html> could assume that too? If that is agreed, we are all clear.

The other remaining issue might be with style matching because <math>
will then be internally in the MathML namespace whereas the HTML
document is in the none namespace (at present), but we will see how it
goes from there.
---
RBS

Ian Hickson

Sep 26, 2006, 9:23:38 PM
to Roger B. Sidje, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Wed, 27 Sep 2006, Roger B. Sidje wrote:
>
> The other issue is those 2000 entities that MathML has.

Yeah... Do we really need those? Some of them seem reasonable to add, but
2000 seems like too many for the mnemonic advantage to beat just using
Unicode codepoints...

The problem with adding entities is that a LOT of people do things like

href="/u?aa=foo&ab=foo&ac=foo&ad=foo"

...which today works, but would break if MathML entities were introduced
(since &ac is a MathML entity).
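
To illustrate the breakage, here is a Python sketch of a decoder that,
like tag-soup parsers, accepts entity names without a trailing semicolon.
decode_entities and both tables are made up for the example; the
character for "ac" is simply looked up in Python's html.entities.html5
table.

```python
import re
from html.entities import html5


def decode_entities(text, table):
    """Decode &name references, with or without ';', using longest-prefix match."""
    def repl(match):
        word, semi = match.group(1), match.group(2)
        for end in range(len(word), 0, -1):
            if word[:end] in table:
                tail = word[end:]
                # Swallow the semicolon only when the whole word was a known name.
                return table[word[:end]] + tail + (semi if tail else "")
        return match.group(0)  # unknown name: leave the text untouched
    return re.sub(r"&([A-Za-z]+)(;?)", repl, text)


url = "/u?aa=foo&ab=foo&ac=foo&ad=foo"
today = {"amp": "&", "lt": "<", "gt": ">"}    # toy stand-in for the current entity list
with_mathml = dict(today, ac=html5["ac;"])    # hypothetical: "ac" added to that list

print(decode_entities(url, today))        # the URL comes back unchanged
print(decode_entities(url, with_mathml))  # "&ac" is now replaced by a character
```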


> Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all
> W3C entities _by default_?

Don't do anything based on the DOCTYPE. HTML5 is anything sent as
text/html.


> The other remaining issue might be with style matching because <math>
> will then be internally in the MathML namespace whereas the HTML
> document is in the none namespace (at present), but we will see how it
> goes from there.

I don't see why this would cause any problems.

Roger B. Sidje

Sep 26, 2006, 11:10:17 PM
to Ian Hickson, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On 27/09/2006 11:23 AM, Ian Hickson wrote:
>
> The problem with adding entities is that a LOT of people do things like
>
> href="/u?aa=foo&ab=foo&ac=foo&ad=foo"
>
> ...which today works, but would break if MathML entities were introduced
> (since &ac is a MathML entity).
>

That list is so big that trying to hand-pick some and leaving some out
would need another committee...

>>Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all
>>W3C entities _by default_?
>
>
> Don't do anything based on the DOCTYPE. HTML5 is anything sent as
> text/html.

I thought the DOCTYPE was trustworthy -- based on this excerpt from the
HTML5 spec:

"HTML documents that use the new features described in this
specification must start with the string <!DOCTYPE html> and, if they
are served over the wire (e.g. by HTTP) must be labelled with the
text/html MIME type."

If so, it would have meant fewer conflicts with agreed entities in HTML5.

BTW, for my own information, do you intend HTML5 to be transitional,
almost-standards, or strict? If it is HTML5 (or XHTML5) served as
text/html but put in the XHTML namespace at some later stage (as the
HTML5 spec implies), it better be strict, no? And that would be driven by the
DOCTYPE detection code. Catch my drift? Or is tag soup going to be in
the XHTML namespace?

If it is strict then maybe entities could be required to have a
semi-colon -- which will then avoid the ambiguities you mentioned above.

Not that I have a position on this (at least as yet). I am just bringing
in some food for thought, to accommodate the realistic issues of MathML.
---
RBS

Ian Hickson

Sep 27, 2006, 1:59:04 AM
to Roger B. Sidje, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Wed, 27 Sep 2006, Roger B. Sidje wrote:
> On 27/09/2006 11:23 AM, Ian Hickson wrote:
> >
> > The problem with adding entities is that a LOT of people do things
> > like
> >
> > href="/u?aa=foo&ab=foo&ac=foo&ad=foo"
> >
> > ...which today works, but would break if MathML entities were
> > introduced (since &ac is a MathML entity).
>
> That list is so big that trying to hand-pick some and leaving some out
> would need another committee...

Not really... I say we just add ApplyFunction, InvisibleComma, and
InvisibleTimes (but not their short aliases).


> > > Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting
> > > all W3C entities _by default_?
> >
> > Don't do anything based on the DOCTYPE. HTML5 is anything sent as
> > text/html.
>
> I thought the DOCTYPE was trustworthy -- based on this excerpt from the
> HTML5 spec:
>
> "HTML documents that use the new features described in this
> specification must start with the string <!DOCTYPE html> and, if they
> are served over the wire (e.g. by HTTP) must be labelled with the
> text/html MIME type."

That's an authoring conformance requirement, and has no bearing on
implementations.


> BTW, for my own information, do you intend HTML5 to be transitional,
> almost-standards, or strict?

HTML5 documents starting with <!DOCTYPE HTML> must be in standards mode.
Documents with other DOCTYPEs or no DOCTYPE at all may be in another mode,
as already described in the spec. In due course I may specify quirks mode
and then there'll just be the spec, and no other modes.


> If it is HTML5 (or XHTML5) served as text/html but put in the XHTML
> namespace at some later stage (as the HTML5 spec implies), it better be
> strict, no? And that would be driven by the DOCTYPE detection code.
> Catch my drift? Or is tag soup going to be in the XHTML namespace?

Not sure what you mean by that. All HTML DOM nodes are (per HTML5) in the
XHTML namespace, irrespective of the standards/quirks thing.


> If it is strict then maybe entities could be required to have a
> semi-colon -- which will then avoid the ambiguities you mentioned above.

That would break back-compat.
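
For a concrete feel of the back-compat constraint, Python's standard
html.unescape, which is documented as following the HTML5 rules for
character references, can serve as a stand-in for a parser. This is an
illustration only, not a statement about what any browser does.

```python
import html

# Legacy names are decoded even without the semicolon, which existing pages rely on.
legacy = html.unescape("&copy 2006")

# A name outside the legacy list, such as "ac", is decoded only with its semicolon...
with_semicolon = html.unescape("&ac;")

# ...so a query string that happens to contain "&ac=" is left untouched.
query = html.unescape("/u?aa=foo&ab=foo&ac=foo&ad=foo")

print(legacy)
print(with_semicolon)
print(query)
```

Requiring the semicolon everywhere would turn the first case back into
literal text, which is the back-compat break; requiring it only for the
newly added names keeps the third case safe.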

William F Hammond

Sep 27, 2006, 12:25:11 PM
to dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
Ian Hickson <i...@hixie.ch> writes:

> On Wed, 27 Sep 2006, Roger B. Sidje wrote:
> . . .
>> Implementation-wise, as this inclusion of MathML-in-HTML5 marks the
>> beginning of tag soup, ...
>
> Don't worry, these tags auto-close when a parent tag is closed.

Two points for clarification:

1. There's the old issue, related to dual parsers, of trying to get
Mozilla family user agents to give proper handling of XHTML+MathML
when served through text/html -- following early Amaya practice. (In
the end the W3C HTML WG refused to support this idea and spawned the
mimetype application/xhtml+xml.) It seems that formally correct
XHTML+MathML would now gain coverage as text/html under current WhatWG
thinking, at least when XML namespaces are evident only through use of
the xmlns attribute (which would be ignored in tag soup), i.e., no use
of xml namespace prefixing. Is this correct?

2. Is WhatWG entertaining the idea that off-the-cuff tag soup writers
will generate MathML content that's good enough for Mozilla rendering?

---

In case you don't know:

The W3C Math group has announced that it is beginning to think seriously
about author-level markup for math.

Long term -- say ten years in the future (we've already been at this
for ten years) -- I think author level math additions to the tag soup
vocabulary would work out much better, especially with enhanced CSS
support.

Cheers.

-- Bill

----------------------------------------------------------------------
William F. Hammond Dept. of Mathematics & Statistics
518-442-4625 The University at Albany
hammond At math.albany.edu Albany, NY 12222 (U.S.A.)
http://www.albany.edu/~hammond/ Dept. FAX: 518-442-4731
----------------------------------------------------------------------

David Carlisle

Sep 27, 2006, 12:44:42 PM
to r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org
I don't think I saw Ian's original comment, just Roger's reply.

> What I would be proposing for HTML5 is just the following list of
> elements:
>
> math, mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom,
> mfenced, menclose, msub, msup, msubsup, munder, mover, munderover,
> mmultiscripts, mtable, mlabeledtr, mtr, mtd, maction

You would need to include the leaf elements (mi mn mo mtext), otherwise
there'll be no characters in the MathML! Also, mspace is pretty
important.

But, as a more general point, I think it's dangerous for a spec to be
profiled by _implementations_. The Math WG activity has just been
restarted at W3C and if there is a need to profile MathML to
presentation MathML (or a subset thereof) please can it be done _there_
so that there is some chance that MathML authoring tools can be
customised to have options to generate code to match any profiled spec.

> I don't like mlabeledtr very much (I have already expressed my views
> about it to folks of the MathML WG)

Roger, I don't see anything searching for
http://www.w3.org/Search/Mail/Public/search?type-index=www-math&index-type=t&keywords=mlabeledtr&search=Search
I know you've talked to us at conferences etc, but we're all getting old
and if comments aren't on the comment list, then they are likely to get
forgotten over time.

_Now_ would be a really good time to make such comments as we are in the
process of finalising the requirements for what extra features should
be in MathML3, and what features, if necessary, should be deprecated.


I don't remember specific discussions about an <mtr label="...">. I
would guess there would be some concern about the label being an
attribute rather than an element restricting the possibilities, but
implementation advice on difficulties with the current scheme would be
taken seriously....

Ian wrote about entities


> Yeah... Do we really need those? Some of them seem reasonable to add, but
> 2000 seems like too many for the mnemonic advantage to beat just using
> Unicode codepoints...

I'd say that it's probably not worth including only a few, it would just
lead to confusion. The problem is that much mathml is generated using
tools and those tools may use entities, and if they do that the user
hasn't much control over which are used, and how to fix things to remove
entities that are not supported in the browser. It would be better to
just get the MathML authoring tools to use characters or character refs
directly and tell the user mathml entities are not supported (but html
ones are)

David

Ian Hickson

Sep 27, 2006, 2:06:36 PM
to William F Hammond, dev-tec...@lists.mozilla.org, dev-tec...@lists.mozilla.org
On Wed, 27 Sep 2006, William F Hammond wrote:
>
> 1. There's the old issue, related to dual parsers, of trying to get
> Mozilla family user agents to give proper handling of XHTML+MathML when
> served through text/html -- following early Amaya practice. (In the end
> the W3C HTML WG refused to support this idea and spawned the mimetype
> application/xhtml+xml.) It seems that formally correct XHTML+MathML
> would now gain coverage as text/html under current WhatWG thinking, at
> least when XML namespaces are evident only through use of the xmlns
> attribute (which would be ignored in tag soup), i.e., no use of xml
> namespace prefixing. Is this correct?

I'm confused by your terminology.

MathML using namespaces and XML syntax would not, under the WHATWG
proposals here, be formally correct. XML sent as text/html is never
correct per the "WHATWG thinking".

What is being proposed here is a non-XML syntax, to be formally described
in the HTML5 specification, which, when processed by an HTML5 UA, would
generate a DOM that can then be processed per the MathML2 specification.

Per the WHATWG specifications, the presence of an "xmlns" attribute is
always a conformance error in any content sent as text/html.


> 2. Is WhatWG entertaining the idea that off-the-cuff tag soup writers
> will generate MathML content that's good enough for Mozilla rendering?

The idea being entertained is that off-the-cuff HTML5 authors, and HTML5
editors, would create content which, when processed by an HTML5 UA (such
as Mozilla, in due course), would render as MathML markup would.


> The W3C Math group has announced that it is beginning to think seriously
> about author-level markup for math.
>
> Long term -- say ten years in the future (we've already been at this for
> ten years) -- I think author level math additions to the tag soup
> vocabulary would work out much better, especially with enhanced CSS
> support.

On the very short term, the proposal here is just a proof of concept. On
the medium term (12 months) I was considering specifying more complex
parsing rules for MathML such that the same MathML2-compatible DOM could
be obtained from much smaller markup, e.g. by implying <mo> tags around
operators and <mn> tags around numbers.
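
To make the idea of implied token tags concrete, here is a minimal Python sketch. The tokenising rules are assumptions made for illustration only (no such rules have been specified): digit runs become `<mn>`, letter runs become `<mi>`, any other non-space character becomes `<mo>`, and the result is wrapped in `<mrow>`. The `<mi>` and `<mrow>` handling goes beyond what the paragraph above mentions, and `imply_tokens` is a hypothetical name.

```python
import re
from xml.sax.saxutils import escape

# One token per match: a number, a run of letters, or a single other character.
_TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([A-Za-z]+)|(\S))")

def imply_tokens(text):
    """Wrap bare numbers, identifiers and operators in MathML token elements."""
    parts = []
    for number, ident, op in _TOKEN.findall(text):
        if number:
            parts.append(f"<mn>{number}</mn>")
        elif ident:
            parts.append(f"<mi>{ident}</mi>")
        else:
            parts.append(f"<mo>{escape(op)}</mo>")
    return "<mrow>" + "".join(parts) + "</mrow>"

print(imply_tokens("2+2=4"))
```

A real parser would also have to decide what counts as an operator (multi-character operators, function names such as sin), which this sketch deliberately ignores.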

HTH,

Roger B. Sidje

Sep 28, 2006, 4:52:26 AM
to David Carlisle, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org
On 28/09/2006 2:44 AM, David Carlisle wrote:

> I don't remember specific discussions about an <mtr label="..."> I
> would guess there would be some concern about the label being an
> attribute rather than an element restricting the possibilities, but
> implementation advice on difficulties with the current scheme would be
> taken seriously....

Here is an informative thread about it:
http://groups.google.com/group/netscape.public.mozilla.mathml/browse_thread/thread/d77d015a1fffc6fb/5b0eb0cc9724ce72
(not on www-math, though. Maybe I should forward it there?)

It appeared that attributes (like those in <mfenced>) aren't unanimous
either. But having a bloated tag that won't be implemented in the next
several years isn't really helpful.

> Ian wrote about entities
>
>>Yeah... Do we really need those? Some of them seem reasonable to add, but
>>2000 seems like too many for the mnemonic advantage to beat just using
>>Unicode codepoints...
>
> I'd say that it's probably not worth including only a few, it would just
> lead to confusion.

I am actually a fan of entities because they improve readability a fair
bit. I hope Ian won't give up thinking on this issue so quickly...
especially in the context of MathML where strange characters are quite
common.

As to my suggestion that "if [a document] is strict then maybe entities
could be required to have a semi-colon -- which will then avoid the
ambiguities", to which Ian responded that "That would break back-compat":

We have other cases of broken back-compat -- where users were told to
use a non-strict DOCTYPE or some other workaround, e.g., line-height of
images.
---
RBS

David Carlisle

Sep 28, 2006, 5:24:59 AM
to r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org

Roger,
Thanks for the link on <mtr label="mylabel">,

> It appeared that attributes (like those in <mfenced>) aren't unanimous
> either.

yes mfenced also "suffers" from requiring attributes, but probably one
is more likely to need markup in an equation label than in a stretchy
operator. It's not so uncommon to want superscript * or daggers etc to
highlight special versions of formulae, and mfenced is explicitly a
shorthand form so you can always use the mrow/mo form if you need an
operator that is "decorated" in some way. That would not be the case
here if mlabeledtr were deprecated and an attribute form was the
only version. (Actually it would if the attribute could then be
css-styled using css generated content.) Allowing css (or other
mechanism) auto numbering is I think a highly requested feature for
mathml3.


> (not on www-math, though. Maybe I should forward it there?)

Yes please do. When we are doing a pass for errata or pulling in feature
requests for a new version we can do a more or less exhaustive check of
the official comment list but (even with google's help) doing an
exhaustive check of the entire web's a bit hard:-)


The charter for the current working group

http://www.w3.org/Math/Documents/Charter2006.html

has as one of its headline work items

Extension of MathML with enhanced support for equation labeling,
including automatic numbering, general label placement and style, and
resolution of references.

so getting that specified in a way that ensures that implementations
can implement it sounds like a good idea, and the timing is good now
to get new features in this area if that is needed. If WhatWG members
are interested in mathml, most of them are w3c members and could join the
WG of course (currently only Opera is represented out of the main
browser vendors). But WG membership isn't really needed; we can do the
technical discussion on the public www-math list if that is appropriate.

> I am actually a fan of entities because they improve readability a fair
> bit.

Well as you know I've invested a frightening number of hours maintaining
that entity set (and the draft iso set at www.w3.org/2003/entities,
which is the same thing, really) so I also think they are valuable,
although it's a kind of love-hate relationship most of the time:-)

> I hope Ian won't give up thinking on this issue so quickly...
> especially in the context of MathML where strange characters are quite
> common.

Yes I think the ideal situation is that they all be allowed. My comment
was that subsetting them is likely to be more confusing than helpful.

> As to my suggestion that "if [a document] is strict then maybe entities
> could be required to have a semi-colon -- which will then avoid the
> ambiguities", to which Ian responded that, "That would break back-compat."

Requiring a ; would seem reasonable to me (ie make the lack of a ; make
the & into an implicit &amp; rather than be an error as in xml).
That does have a theoretical backward compatibility problem in that
&rightarrow; would be an arrow instead of &amp;rightarrow; but I would
have thought that the occurrences of any such construction outside of
test suites was rather rare.
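
The rule suggested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python's `html.entities.html5` table stands in for whatever entity list would be adopted, and `expand_strict` is a hypothetical name. A reference is expanded only when it ends in a semicolon and the name is known; anything else is kept as literal text, as if the & had been written &amp;.

```python
import re
from html.entities import html5

# An ampersand, a name, and an optional terminating semicolon.
_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*)(;?)")

def expand_strict(text):
    """Expand named references only when they are terminated by ';'."""
    def repl(match):
        name, semicolon = match.group(1), match.group(2)
        if semicolon and name + ";" in html5:
            return html5[name + ";"]
        return match.group(0)  # no semicolon or unknown name: literal text
    return _REF.sub(repl, text)

print(expand_strict("a &rarr; b"))
print(expand_strict("x=1&copy=2"))
```

The second call shows the case behind the back-compat worry: legacy text that relies on an unterminated name stays literal under this rule, whereas tag-soup parsers have traditionally expanded some of those names.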

David

White Lynx

Sep 28, 2006, 8:38:50 AM
I consider switching from XML to text/html an inappropriate and
pointless development; moreover, it is damaging in the long term.


First of all it is unclear where this idea comes from, as the MathML
community has no legacy text/html content that one should care about.
All MathML content is well-formed (by definition), which means that one
has fewer errors in MathML documents compared to what one would have in
a tagsoup approach. It also means that all MathML content can be
handled with XML tools, can be processed with XSLT, matched using
XPath, mixed with other XML-based markup languages (OpenMath, SVG) etc.
There is no single MathML implementation that supports text/html
tagsoup but does not support X(HT)ML, while the inverse is not true: there
are XML-only MathML implementations that by definition have nothing to
do with HTML legacy.
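
As a small illustration of the point about generic XML tooling, the Python sketch below parses a MathML fragment with the standard library's `xml.etree.ElementTree` and selects elements with a namespace-aware path expression. The sample markup and the helper name `operators` are made up for this example; ill-formed input is rejected by the parser rather than repaired.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

SAMPLE = (
    f'<math xmlns="{MATHML_NS}">'
    "<mn>2</mn><mo>+</mo><mn>2</mn><mo>=</mo><mn>4</mn>"
    "</math>"
)

def operators(mathml):
    """Return the text of every <mo> element in a well-formed MathML string."""
    root = ET.fromstring(mathml)  # raises ET.ParseError if not well-formed
    return [mo.text for mo in root.iterfind(".//m:mo", {"m": MATHML_NS})]

print(operators(SAMPLE))
```

The same document could equally be fed to an XSLT processor; ElementTree is used here only because it ships with Python.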

Further, it is not clear to me why this has to be done today. After
paying the price for well-formedness and tackling XML-related problems for
seven years, when finally MSIE/MathPlayer accept application/xhtml+xml
and thus allow people to deliver the same XHTML+MathML to
MSIE/MathPlayer and Mozilla (one can add Opera with UserJS), someone
decides to revert (more precisely convert) everything to tagsoup.

The profiling policy sounds unclear and strange to me. Solving the issue on
the level "I'm happy to drop/add any tag to this list. Just give me the
list you want" or based on the MathML support level of some particular
implementation seems to be irresponsible.
There are at least two subgroups in the W3C Math WG to which one could send a
message with a profile proposal after looking at the "wrong table".
One is the liaison with WhatWG subgroup and, as the name suggests, is
expected to ensure that the needs of MathML are addressed in WhatWG specs.
Another is the liaison with CSS subgroup, which is expected to define a MathML
profile suitable for usage in the XML+CSS framework and a few CSS
extensions needed to format the proposed MathML profile.
There is also a subgroup that deals with compound document formats. My
opinion is that profiling of MathML should be coordinated with these
units, as irresponsible steps may spoil W3C efforts in the same area.

One more thing that sounds illogical and rather strange is that
Mozilla/WhatWG try to move MathML further from the XML+CSS framework,
by converting XML to tagsoup with ad hoc parsing rules and embracing
constructions like mstyle, mpadded in the "proposed" profile.


Ian Hickson

Sep 28, 2006, 2:45:36 PM
to Roger B. Sidje, dev-tec...@lists.mozilla.org, David Carlisle, dev-tec...@lists.mozilla.org
On Thu, 28 Sep 2006, Roger B. Sidje wrote:
> >
> > Ian wrote about entities
> >
> > > Yeah... Do we really need those? Some of them seem reasonable to add, but
> > > 2000 seems like too many for the mnemonic advantage to beat just using
> > > Unicode codepoints...
> >
> > I'd say that it's probably not worth including only a few, it would just
> > lead to confusion.
>
> I am actually a fan of entities because they improve readability a fair
> bit. I hope Ian won't give up thinking on this issue so quickly...
> especially in the context of MathML where strange characters are quite
> common.

I really don't want to start introducing weird rules for parsing entities
(I'm trying to simplify the entity parsing rules, not make them worse). At
least not at this stage. Maybe once we have a proof-of-concept working, it
would make more sense to revisit the issue, but I'd want to do a thorough
scan of the Web to see how common these entities actually are today.


> As to my suggestion that "if [a document] is strict then maybe entities
> could be required to have a semi-colon -- which will then avoid the
> ambiguities", to which Ian responded that, "That would break
> back-compat."
>

> We have other cases of broken back-compat. -- where users were told to
> use a non-strict DOCTYPE or some other workaround, e.g, line-height of
> images.

Yeah. And we can see how well _that_ went. QA nightmare, multiple
overlapping codepaths, obscure bugs, confused authors, contradicting
documentation, etc. Let's not go there again. The whole point of
MathML-in-HTML is to have back-compat work -- if we didn't care about
back-compat, we would just have people use MathML-in-XHTML.

Roger B. Sidje

Sep 28, 2006, 9:45:46 PM
to David Carlisle, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org
On 28/09/2006 7:24 PM, David Carlisle wrote:

> Roger,
> Thanks for the link on <mtr label="mylabel">,
>
>
>>It appeared that attributes (like those in <mfenced>) aren't unanimous
>>either.
>
>
> yes mfenced also "suffers" from requiring attributes, but probably one
> is more likely to need markup in an equation label than in a stretchy
> operator. It's not so uncommon to want superscript * or daggers etc to
> highlight special versions of formulae, and mfenced is explicitly a
> shorthand form so you can always use the mrow/mo form if you need an
> operator that is "decorated" in some way. That would not be the case
> here if mlabeledtr were deprecated and an attribute form was the
> only version. (Actually it would if the attribute could then be
> css-styled using css generated content.) Allowing css (or other
> mechanism) auto numbering is I think a highly requested feature for
> mathml3.

The danger (and problem) with that tag is that it is over-designed to
accommodate the tiny set of special-cases you alluded to, while holding
the 99.99% majority of cases hostage. One could put up with CDATA all
the way, e.g., (6') or (7*), (8&dagger;), (9a), etc -- if a subequation
is really needed. I would think we can put up with this and reap the
benefits: a <mtr label="mylabel"> tag that stands a chance, degrades
gracefully, *free* cross-referencing (with href#mylabel -- by just
invoking what the browser already does with <a name="...">), the
counters that you mentioned (which work in Gecko today, BTW), etc.
(Also conceivable, optimistically, is a pseudo-class :label to style the
label text, but we might be getting ahead of ourselves...)
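
A minimal sketch of how label-based numbering and reference resolution might behave, written in Python for concreteness. Everything here is hypothetical: the `label` attribute on `<mtr>` is only a suggestion in this thread and is not part of MathML 2, the choice to number only labelled rows is an assumption, and `number_labels` and `resolve` are illustrative names.

```python
import xml.etree.ElementTree as ET

TABLE = (
    "<mtable>"
    '<mtr label="euler"><mtd><mi>a</mi></mtd></mtr>'
    "<mtr><mtd><mi>b</mi></mtd></mtr>"
    '<mtr label="gauss"><mtd><mi>c</mi></mtd></mtr>'
    "</mtable>"
)

def number_labels(markup):
    """Assign 1, 2, 3, ... to labelled rows in document order."""
    numbers = {}
    for row in ET.fromstring(markup).iter("mtr"):
        label = row.get("label")  # hypothetical attribute from this thread
        if label is not None:
            numbers[label] = len(numbers) + 1
    return numbers

def resolve(href, numbers):
    """Turn a fragment reference such as '#gauss' into display text."""
    return f"({numbers[href.lstrip('#')]})"

numbers = number_labels(TABLE)
print(resolve("#gauss", numbers))
```

In a browser the numbering would more plausibly come from CSS counters and the jump from ordinary fragment navigation; the sketch only shows that a plain attribute carries enough information for both.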

Seems to me that the concrete benefits that might result outweigh the
feeling against an attribute.

>
>>(not on www-math, though. Maybe I should forward it there?)
>
> Yes please do.

OK.

> Well as you know I've invested a frightening number of hours maintaining
> that entity set (and the draft iso set at www.w3.org/2003/entities,
> which is the same thing, really) so I also think they are valuable,
> although it's a kind of love-hate relationship most of the time:-)

Yeah. Let's hope Ian is listening and keeps these entities on his radar...
---
RBS

Juan R.

Sep 29, 2006, 4:04:20 AM
Ian Hickson wrote:
>
> I'm happy to drop/add any tag to this list. Just give me the list you
> want.
>

Ok, this is one in LISP syntax for lists: ()

>
> --
> Ian Hickson U+1047E )\._.,--....,'``. fL
> http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
> Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'

No need to reply to the rest of what you are promoting, since basically you may
think -parodying you- that MathML in HTML 5 is <anything sent as
text/html>

Far from simplifying the authoring of mathematical docs and spreading
online maths, you are really making communication still more difficult
for all of us with this strange hybrid that convinces nobody.


Juan R.

Center for CANONICAL |SCIENCE)

David Carlisle

Sep 29, 2006, 4:40:40 AM
to r...@maths.uq.edu.au, dev-tec...@lists.mozilla.org, i...@hixie.ch, dev-tec...@lists.mozilla.org

> Seems to me that the concrete benefits that might result outweigh the
> feeling against an attribute.

Which is why it's good to get real implementation experience into the
language design (or update). Either by implementors joining the WG or
by doing the technical design on the public www-math list so you and
others can join in (or both).

David


sha...@shantirao.com

Oct 2, 2006, 12:13:04 AM
I would like to call attention to RBS's original point: MathML is
imperiled, and something *can* be done. To expound:

1. MathML is nifty. It's the best thing since LaTeX. In fact, it's the
only thing since LaTeX. My colleagues admire my Mozilla-rendered
documents while they struggle with MS Word.

2. MathML is in trouble. My colleagues who use IE can't see my
equations. This makes it unacceptable for me to write anything important
in MathML, so long as I want to succeed at my job. So investing effort
into learning or using MathML is a quixotic proposition.

3. XHTML is the web language of the future -- and it always will be. It
might as well be dead. It was born crippled, and it never will catch on,
for the simple economic reason that HTML is easier to use. XHTML was
supposed to be the replacement of HTML. In fact, it was so popular that
we're moving forward with HTML5.

4. Languages that are not easy to write are ignored. The wasteland of
obsolete internet standards is littered with romantic, intellectually
superior, morally defensible languages like XFORMS, VRML, and SVG. Boy,
those sure made our lives better! Compare those to what actually gets
used: unvalidated HTML, CSS, JavaScript, and the DOM. All marginally
self-consistent languages that are easy to write and tolerant of abuse.

5. If MathML is not widely understood and easily used by browsers, say
by being a part of HTML5, then sites that drive technology adoption,
like Wikipedia, will have no incentive to switch from the current
TeX->PNG kludge. Lacking a large user base, MathML will not grow.

6. Although the MathML community is self-contained today, we all know
what happens to species that evolve on islands: they get smaller and
prone to extinction. The community needs to grow, and incorporation in
HTML5 is something we should all get behind.

Shanti

* Camel = a horse designed by committee

Chris Chiasson

Oct 2, 2006, 1:10:45 AM
The reason MathML is in trouble is because Microsoft hasn't implemented
it (and many other good XML technologies) natively into Internet
Explorer. They are using their inertia to screw over the open
standards. It's hard for Firefox to compete in a (MathML) market that
doesn't exist.

The best thing that could be done without MS help is to make XML
handling plugins as ubiquitous and easy to install as Adobe's
Macromedia Flash plugin.

Maybe it would be prudent to make an open source MathML and SVG plugin
for IE so people don't have to rely on the changing winds of corporate
desires and licensing. You could call it something like Firefox
sub-rendering for IE - or whatever.

White Lynx

Oct 2, 2006, 5:33:30 AM
> XHTML is the web language of the future -- and it always will be. It
> might as well be dead.

MathML is not necessarily confined to XHTML; it may use other XML
applications as host languages. In particular one can name several XML
applications that are much more suitable for encoding scientific
articles than XHTML (NIH Journal Publishing DTD, DocBook, TEI). Of
course XHTML will remain the most widespread host language for
MathML, but it is not something that MathML absolutely depends on. And
XML in general is apparently not dead; it is enough for MSIE to fix
their broken parser, and the people that yesterday argued that we all
must switch to XHTML, and today argue that HTML5 is the only way to go,
tomorrow may adjust their opinion once more. It should not be a
problem.

> Languages that are not easy to write are ignored

Well compare XML
<mmultiscripts><mi>A</mi><mprescripts/><none/><mi>B</mi></mmultiscripts>
and HTML
<mmultiscripts><mrow>A</mrow><mprescripts></mprescripts><none></none><mrow>B</mrow></mmultiscripts>
which, when processed by the parser, will generate mi-mo-mu tagsoup
automatically
<mmultiscripts><mrow><mi>A</mi></mrow><mprescripts></mprescripts><none></none><mrow><mi>B</mi></mrow></mmultiscripts>
So how did switching to HTML help to make the language human processable?

I am definitely for turning MathML into a human processable language, and
removing mi-mo-mu (explicit markup is useful for stuff like integrals,
N-ary operators, delimiters, but otherwise it is just bloat
<mn>2</mn><mo>+</mo><mn>2</mn><mo>=</mo><mn>4</mn>), however this can
be done and should be done within XML, without introducing telepathic
parsing rules. If mi-mo-mu are not available in the original source and are
generated by the parser then their semantic value is exactly zero (and yes
I know that it is close to zero in any case). The ECMA approach is one
possible way to remove mi-mo-mu and use something like <nary> (but
not exactly the <nary> construction, which is the most CSS unfriendly part
of ECMA math markup) for operators and just <i> for italic.
So we should either remove it from MathML (the problem however is lack
of consensus in the WG on the issue) or keep it. Removing it from the source but
keeping it in the DOM does not make any sense, as you remove semantics but
keep this stuff in the DOM.

> Although the MathML community is self-contained today, we all know
> what happens to species that evolve on islands: they get smaller and
> prone to extinction.

Integration with the environment in which formulae are embedded is crucial
for any mathematical markup. All other approaches are closely
integrated in some extensible framework with a powerful formatting
mechanism (LaTeX/TeX, ISO-12083/SGML+DSSSL, OfficeMath/WordML).
Extensibility and availability of a full-featured style language or
equivalent formatting mechanism are crucial here. In the case of MathML the
environment is the web, so integration of MathML into an extensible framework
is integration into XML+CSS+DOM, which is on the agenda of the Math WG. In
contrast, HTML5 does not give us an extensible framework, and ad hoc parsing
rules do not help us to integrate MathML with CSS while keeping the DOM
synchronised with the actual markup.

William F Hammond

Oct 2, 2006, 11:23:32 AM
to dev-tec...@lists.mozilla.org
"Chris Chiasson" <chris.c...@gmail.com> writes:

> ...


> The best thing that could be done without MS help is to make XML
> handling plugins as ubiquitous and easy to install as Adobe's
> Macromedia Flash plugin.

I acquired a new machine with MS Windows XP (Home) recently and found
that it had both IE and AOL/NetScape visible as desktop icons. Of
course, NetScape rendered XHTML+MathML.

Installing the Design Science plugin for IE called MathPlayer was
quite easy, but one does need to know where to go to get it. So the
math community might consider advertising its location -- or at least
advising its readers to google for "MathPlayer".

> Maybe it would be prudent to make an open source MathML and SVG plugin
> for IE so people don't have to rely on the changing winds of corporate
> desires and licensing. You could call it something like Firefox
> sub-rendering for IE - or whatever.

What was new for me about the OEM NetScape was that it would render in
IE mode if asked.

It's curious that an OEM platform should include NetScape along with
IE but not provide a seamless plugin for IE. Perhaps Microsoft will
want to rethink that.

-- Bill

Paul Topping

Oct 2, 2006, 12:16:39 PM
to William F Hammond, dev-tec...@lists.mozilla.org
Hi,

Thanks Bill (Hammond) for mentioning our MathPlayer plugin. While I
understand that people might want IE to support MathML "out of the box",
many capabilities in many apps are provided as plugins. I don't think it
is right to think that all plugins are bad. Plugins allow companies like
mine, with an interest in providing technology in a particular area, to
move technology forward independently of monsters like Microsoft. In
other words, if Microsoft provided MathML support in IE, it wouldn't be
as good as MathPlayer and everyone would be complaining about that.

Of course, demanding that Microsoft support XHTML in IE is perfectly
reasonable. IE does a really good job, IMHO, of allowing plugins like
MathPlayer to support embedded XML languages, except in HTML, not XHTML.
MathPlayer works around this by allowing carefully prepared XHTML+MathML
to work in IE but proper support for XHTML in IE would be better.

Paul Topping
President & CEO

Design Science, Inc.
"How Science Communicates"
Makers of MathType, MathFlow, WebEQ, MathPlayer, Equation Editor,
TeXaide
http://www.dessci.com


Chris Chiasson

Oct 2, 2006, 2:31:40 PM
White Lynx wrote:
>All other approaches are closely
> integrated in some extensible framework with powerful formatting
> mechanism (LaTeX/TeX, ISO-12083/SGML+DSSSL, OfficeMath/WordML).

I don't know how common the knowledge is, but MathML is closely tied
with a certain platform: Mathematica

Wolfram Research (makers of Mathematica) was one of the originators of
MathML. Anyway, present day MathML is strongly related to Mathematica's
internal representation of math, as shown in this short example.

Consider Euler's formula as entered in the most source-code like syntax
available in Mathematica (called InputForm):

E^(I*x)==Cos[x]+I*Sin[x]

After parsing, it becomes this (called FullForm):

Equal[Power[E,Times[Complex[0,1],x]],Plus[Cos[x],Times[Complex[0,1],Sin[x]]]]

Compare this with content MathML

<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <apply><eq/>
    <apply><power/><exponentiale/>
      <apply><times/><imaginaryi/><ci>x</ci></apply></apply>
    <apply><plus/>
      <apply><cos/><ci>x</ci></apply>
      <apply><times/><imaginaryi/><apply><sin/><ci>x</ci></apply></apply>
    </apply>
  </apply>
</math>

Notice how <apply> is used to capture the structure of
head[arg1,arg2,arg3] as <apply><head/><arg1/><arg2/><arg3/></apply>.
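
The correspondence can be checked mechanically. The following Python sketch walks the content MathML for Euler's formula and prints a FullForm-style string; the two name tables cover only the elements used in this example and are an illustrative assumption, not a complete or official mapping between MathML and Mathematica.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/1998/Math/MathML}"

# Illustrative name tables covering only the elements used below.
HEADS = {"eq": "Equal", "power": "Power", "times": "Times",
         "plus": "Plus", "cos": "Cos", "sin": "Sin"}
CONSTANTS = {"exponentiale": "E", "imaginaryi": "Complex[0,1]"}

def local(node):
    """Return the element name without its namespace part."""
    return node.tag.replace(NS, "")

def full_form(node):
    """Render <apply><head/>args...</apply> as head[args...]."""
    name = local(node)
    if name == "math":
        return full_form(node[0])
    if name == "apply":
        head, *args = list(node)
        return HEADS[local(head)] + "[" + ",".join(map(full_form, args)) + "]"
    if name in ("ci", "cn"):
        return node.text.strip()
    return CONSTANTS[name]

EULER = (
    "<math xmlns='http://www.w3.org/1998/Math/MathML'><apply><eq/>"
    "<apply><power/><exponentiale/>"
    "<apply><times/><imaginaryi/><ci>x</ci></apply></apply>"
    "<apply><plus/><apply><cos/><ci>x</ci></apply>"
    "<apply><times/><imaginaryi/><apply><sin/><ci>x</ci></apply></apply>"
    "</apply></apply></math>"
)

print(full_form(ET.fromstring(EULER)))
```

With these tables the printed string is the same as the FullForm expression shown earlier in this message.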

However, when an equation like this is typeset in Mathematica, it is
converted to a box structure. I'll use StandardForm boxes for this
example:

RowBox[{
  SuperscriptBox["\[ExponentialE]", RowBox[{"\[ImaginaryI]", " ", "x"}]],
  "\[Equal]",
  RowBox[{
    RowBox[{"Cos", "[", "x", "]"}],
    "+",
    RowBox[{"\[ImaginaryI]", " ", RowBox[{"Sin", "[", "x", "]"}]}]}]}]

Note that RowBox means that the items within should have the same
baseline. Obviously, the possible necessity to linebreak complicates
things somewhat. However, the box structure remains invariant (which is
why I think it's odd that Firefox doesn't linebreak <mrow>).

Compare this with presentation MathML:

<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
    <msup>
      <mi>&#8519;</mi>
      <mrow><mi>&#8520;</mi><mo>&#8290;</mo><mi>x</mi></mrow>
    </msup>
    <mo>&#63449;</mo>
    <mrow>
      <mrow><mi>cos</mi><mo>&#8289;</mo><mo>(</mo><mi>x</mi><mo>)</mo></mrow>
      <mo>+</mo>
      <mrow>
        <mi>&#8520;</mi><mo>&#8290;</mo>
        <mrow><mi>sin</mi><mo>&#8289;</mo><mo>(</mo><mi>x</mi><mo>)</mo></mrow>
      </mrow>
    </mrow>
  </mrow>
</math>

So those plentiful <mrow> elements shouldn't be unexpected. Also, it
becomes pretty apparent why presentation MathML is nearly
incomprehensible. It is a representation of an already verbose two
dimensional box formatting system in XML, making it even more verbose.

Of course, the fact that presentation MathML is a translation of a box
formatting system makes it well suited to styling by CSS.

Mathematica's box formatting subsystem (called the FrontEnd)
understands very few operators (input shortcuts). One of the few is
Rule (lhs->rhs). It doesn't even understand Plus (l+m+r). It certainly
wouldn't understand what's going on if someone left out a RowBox.

In that respect, presentation MathML is slightly more flexible, because
it can "insert" some implicit row boxes when the markup wouldn't make
sense otherwise.

Anyway, I don't speak for WRI, but I think it's fairly obvious they
will try to keep MathML "in their image" so that it will be easy for
them to have an XML language for math that is understood by machines
... aka their computer algebra system.

IMHE (in my humble estimation) Firefox people would be better off
trying to define "shorthand" definitions for the content MathML system,
which WRI will be less likely to oppose.

White Lynx wrote:
> Extensibility and availability of a full-featured style language or
> equivalent formatting mechanism are crucial here.

Agreed. I think it would be imprudent to remove formatting structures
from presentation MathML because that would make it harder to write
appropriate CSS.

Paul Topping

Oct 2, 2006, 3:00:07 PM
to Chris Chiasson, dev-tec...@lists.mozilla.org
Chris,

While Mathematica people were heavily involved in MathML's creation, it
is hardly the result of their effort alone. They provided some much
needed early impetus and hosted two MathML conferences but since then
they have been more noticeable by their absence from the MathML
community. At any rate, the notion that they have some kind of control
over it now is just not even close to being the case.

If anyone has opinions on how MathML can be improved, they should
participate in the W3C's MathML 3.0 effort just getting underway. Then
they can see for themselves that Wolfram/Mathematica doesn't run the
show. Actually, I half expected someone to accuse my company, Design
Science, of that these days.

I would encourage anyone to create front ends that save as MathML.
Either GUI ones like our products or "programming" languages that are
converted into MathML. Now that MathML has been fairly well established
as the XML representation for math, ease of conversion should be a goal
for any front end. However, IMHO ease of use should take priority over
this.

Paul Topping
Design Science, Inc.
www.dessci.com

> -----Original Message-----
> From: dev-tech-ma...@lists.mozilla.org
> [mailto:dev-tech-ma...@lists.mozilla.org] On Behalf
> Of Chris Chiasson
> Sent: Monday, October 02, 2006 11:32 AM
> To: dev-tec...@lists.mozilla.org
> Subject: Re: MathML-in-HTML5
>

Jacques Distler

Oct 2, 2006, 10:44:35 PM
In article
<mailman.6128.115938040...@lists.mozilla.org>, Ian
Hickson <i...@hixie.ch> wrote:


>What is being proposed here is a non-XML syntax, to be formally described
>in the HTML5 specification, which, when processed by an HTML5 UA, would
>generate a DOM that can then be processed per the MathML2 specification.
>

> ...


>
>On the very short term, the proposal here is just a proof of concept. On
>the medium term (12 months) I was considering specifying more complex
>parsing rules for MathML such that the same MathML2-compatible DOM could
>be obtained from much smaller markup, e.g. by implying <mo> tags around
>operators and <mn> tags around numbers.

Please don't go down that road.

Let's not have two incompatible markup languages, both called "MathML,"
one of which can be embedded in HTML5, the other in XHTML.

If you want MathML-in-HTML5, create a profile (along the lines of
XHTML's Appendix C) of MathML 2.0 that is safe to consume by the
Tag-Soup parser.

--
PGP public key: http://golem.ph.utexas.edu/~distler/distler.asc

Roger B. Sidje

Oct 3, 2006, 1:48:17 AM
to White Lynx, www-...@w3.org, dev-tec...@lists.mozilla.org
On 28/09/2006 10:44 PM, White Lynx wrote:

> I consider switching from XML to text/html as inappropriate and
> pointless development, moreover it is damaging in long term perspective.

Damaging to what? To MathML? Not really in my opinion. What damage could
there be to have plenty of MathML formulas on the web?!? But to the
XML/XHTML agenda, possibly. And that has been the real "problem" since
the beginning, and which I alluded to in my opening post. It wasn't a
fight fitted for a niche MathML that was already struggling to make a
name for itself.

Interested in using MathML? First pass that XHTML barrier, and that
wasn't even a small barrier. It was a significant barrier, taking seven
years before IE understood application/xhtml+xml. As for the fact that
"the people that yesterday argued that we all must switch to XHTML,
today argue that HTML5 is the only way to go": speaking generally (or
specifically w.r.t. MathML)? People had to switch to XHTML to get MathML
-- it wasn't even a matter of choice. C.f. again this very insightful
post on the matter.
http://groups.google.com/group/netscape.public.mozilla.mathml/msg/4d58c35217afcb54?dmode=source

So after all these years making the case for something else (XHTML),
what this thread is about is to make <math>...</math> work everywhere,
especially where it still matters the most today, and that is HTML5. As
I indicated, my original take is for <math>...</math> to work as-is --
as we have come to know and enjoy it. But it is obvious that this new
mixing has to be defined somehow, even if we later come to a conclusion
saying that it is an opaque <object>, or a profile of some sort.

But I hope that as further insight is gathered through the
proof-of-concept, it turns out that <math>...</math> is just fine, and
that interoperability issues won't be thrown at an already special niche
technology. While on this, I should stress that tag-soup is possible
anywhere, although this is often not mentioned because the extent is
much different. Well-formed tag-soup (as odd as it sounds...) is
possible, which is why these reddish "invalid-markup" messages sometimes
pop in Gecko's MathML rendering. Such things are left undefined by the
spec. However, in the case of MathML where the markup is generated
automatically by software, there is no particular reason to believe that
these generators will suddenly start to generate an indigestible
tag-soup. So it is not quite realistic to over-emphasize this issue.

MathML already works in XML/XHTML and this proposal is not going to
break that. But there is little else to gain there (as far as MathML is
concerned). Publishers who use XML in their back-end production line can
continue to do what they have been doing.

However, MathML stands to win more (especially individual users) in the
front-end by being in HTML (HTML5 for that matter). This might also
encourage those building HTML authoring tools to consider interfacing
MathML (either with free or commercial plug-ins) because the XML/XHTML
barrier won't be standing right at their face. (On the issue of the
verbosity of MathML, this wouldn't be much of an issue if people didn't
have to stare at the MathML. In fact, when I look at HTML+Javascript+CSS
pages these days, they are also quite cryptic... Is it possible to have
invisible/collapsible MathML in an editor interfaced to a plug-in?
Surely, for people who have experience building comprehensive editors.
But with the XHTML barrier they can't even chime in...)

I am sure by now that it should be evident that it is XML/XHTML that
stand to lose with MathML enabled in HTML5. Anyway, XHTML doesn't seem
to be going anywhere. (How often does one stumble on a page served as
application/xhtml+xml -- if it isn't a page with MathML?) In any case,
as I indicated, it will still work there, maybe not just as _the_
selling argument that it is now. (Many math pages wouldn't have bothered
with XHTML if it had been possible to have MathML in HTML, and that's
where their loss might come from. But does it really matter? Read Robert
Miner's earlier post again.)

To advance MathML, we contributed a great deal to XML/XHTML and pushed
for them so much that it is very easy to forget the initial focus.
MathML-in-HTML5? Worth a try. The thread is now about the issues in
prototyping this, and the benefits (or otherwise) for MathML and math on
the web. And I must say I don't see many disadvantages in enabling
MathML everywhere at this point.
---
RBS

Paul Topping

Oct 3, 2006, 2:27:42 AM10/3/06
to Roger B. Sidje, White Lynx, www-...@w3.org, dev-tec...@lists.mozilla.org
This all sounds vaguely familiar. When MathML (and Mozilla) were new,
many of us argued for MathML support in Mozilla's HTML parser for many
of the same reasons I see here. We were told by the Mozilla chieftains
that this would only happen over their dead bodies and that XHTML was
the only way we were going to get MathML support. Perhaps it did take us
7 years to get IE to work with XHTML+MathML but IE has also had a
solution for MathML embedded in HTML for even longer.

While Microsoft may have (nasty) business reasons for not supporting
XHTML, they may also have made the argument that the world wasn't ready
to change all their pages into XHTML just for some gain in "purity".
Sounds like some people on this list are coming around to that same
point of view.

So, as I posted a week ago, why not adopt the Microsoft convention for
embedding MathML (or any other XML language) in HTML? Minus the COM
class id stuff, of course. Basically, this would result in a simple
declaration of the embedded language's namespace. For the reasons stated
earlier, just <math> is not enough. At a minimum, it doesn't allow for
smooth transitions to new versions of MathML. Come on, Microsoft isn't
wrong all the time.

Paul Topping
Design Science


White Lynx

Oct 3, 2006, 3:18:43 AM10/3/06
to
> Please don't go down that road.
> Let's not have two incompatible markup languages, both called "MathML,"
> one of which can be embedded in HTML5, the other in XHTML.

Completely agree. Personally I am not against removing mandatory tokens
and following the approach taken by ECMA (this attitude does not
necessarily reflect the position of the Math WG, however), but I am
radically against the current approach. It does not make sense to remove
tokens from markup while preserving them in the DOM (the semantic value of
tokens automatically generated by a parser is zero, and not all
conversion/interchange tools operate through the DOM).

> I don't know how common the knowledge is, but MathML is closely tied
> with a certain platform: Mathematica

They just use MathML for import/export of math formulae. This is not
the kind of integration I meant.

>> I consider switching from XML to text/html as inappropriate and
>> pointless development, moreover it is damaging in long term perspective.

> Damaging to what? To MathML? Not really in my opinion. What damage could
> there be to have plenty of MathML formulas on the web?!?

What prevents you from having plenty of formulae on the web today? Do we
have at least one MathML implementation that supports HTML, but lacks
XHTML support? Do we have MathML implementations that support XHTML
only? So how will introducing two different and incompatible sets of
parsing rules improve interoperability? And assume that you have plenty
of formulae on the web and you want to process them. How does having half
of them in tagsoup and the other half in XML make them easier to
handle?

> But to the
> XML/XHTML agenda, possibly. And that has been the real "problem" since
> the beginning, and which I alluded to in my opening post.

It is not the beginning. Seven years have passed since that time and a lot
of XML applications have emerged since then. Most current W3C formats are
designed with XML in mind, not SGML or HTML. MathML is part of a
large and extensible framework where it can be combined with other XML
applications. The current proposal adds no new functionality to
MathML, but rather artificially splits the MathML community into
incompatible parts that have to be dealt with separately.

> Interested in using MathML? First pass that XHTML barrier, and that
> wasn't even a small barrier. It was a significant barrier, taking seven
> years before IE understood application/xhtml+xml.

It was. But it is not anymore. So it is not clear what you are struggling
with. Maybe someone has to struggle with legacy text/html content, but
it is not our problem: we have no MathML-in-HTML legacy. Maybe someone
complains that MSIE does not support application/xhtml+xml; again, it
is not our problem, as without MathPlayer MSIE cannot process MathML,
while with MathPlayer the application/xhtml+xml problem is N/A.
If someone has doubts about the future of XML in MSIE, note that Microsoft's
own mathematical markup language is (and most other recent formats
are) entirely XML based.

> MathML already works in XML/XHTML and this proposal is not going to
> break that.

XML for maths means better interoperability (and extensibility); this
proposal splits MathML into two different versions.

> This might also
> encourage those building HTML authoring tools to consider interfacing
> MathML (either with free or commercial plug-ins) because the XML/XHTML
> barrier won't be standing right at their face.

Once again there is no barrier, XHTML has all the functionality that
HTML has and much more. The only issue is MSIE parser and as noted
above several times this issue is N/A to MathML today.

> Many math pages wouldn't have bothered
> with XHTML if it had been possible to have MathML in HTML

Which means that going in that direction will give rise to two
different versions of MathML, damaging interoperability and introducing
no new functionality.

> MathML-in-HTML5? Worth a try.

Once you try something you can't always untry it. Just proceed with your
proposal and we will have to struggle with text/html legacy forever.

Jacques Distler

Oct 3, 2006, 9:17:47 AM10/3/06
to
In article <1159859923....@m73g2000cwd.googlegroups.com>,
White Lynx <whit...@operamail.com> wrote:

>It does not make sense to remove
>tokens from markup while preserving them in DOM (the semantic value of
>tokens automatically generated by parser is zero, and not all
>conversion/interchange tools operate through DOM).

HTML does this all the time. (E.g. inferred <tbody> element as a child
of <table>, inferred <head> and <body> elements,...) There's nothing
wrong with inferred elements ... per se.

The only problem occurs when people expect their MathML code (or, more
pertinently, the software they use to generate it) to be interoperable
in an XML context.

>> This might also
>> encourage those building HTML authoring tools to consider interfacing
>> MathML (either with free or commercial plug-ins) because the XML/XHTML
>> barrier won't be standing right at their face.
>
>Once again there is no barrier, XHTML has all the functionality that
>HTML has and much more.

I disagree strongly. XHTML is a *huge* barrier. CMS's that reliably
produce XHTML are rare to nonexistent.

And many users don't have control over the MIME-type their pages are
sent with. If they did, you wouldn't have so many RSS feeds sent as
text/xml (which, unless they are plain ASCII, means they are
automatically ill-formed).
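
The MIME-type point can be illustrated with a minimal Python sketch. It assumes the RFC 3023 rule referred to here: text/xml served without a charset parameter is treated as us-ascii, whereas for application/xml the header leaves the decision to the XML processor. The function names are hypothetical, and this is not a description of what any particular browser actually does.

```python
from email.message import Message


def effective_charset(content_type):
    """Return the charset that the Content-Type header alone implies.

    Returns None when the header does not decide, in which case the
    document's own BOM or encoding declaration has to be consulted.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    explicit = msg.get_param("charset")
    if isinstance(explicit, str) and explicit:
        return explicit.lower()
    if msg.get_content_type() == "text/xml":
        # Assumed RFC 3023 default for text/xml without a charset parameter.
        return "us-ascii"
    return None


def decodes_under_header(body, content_type):
    """Check whether the raw bytes decode under the header's charset."""
    charset = effective_charset(content_type)
    if charset is None:
        return True  # header is silent; left to the XML processor
    try:
        body.decode(charset)
    except (UnicodeDecodeError, LookupError):
        return False
    return True
```

Under these assumptions, a UTF-8 feed containing any non-ASCII character fails the check when served as plain text/xml, and passes once the server adds charset=utf-8.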

>The only issue is MSIE parser and as noted
>above several times this issue is N/A to MathML today.

Precisely. MathPlayer2 allows IE/6 to consume MathML embedded in tag
soup *TODAY*.

There's every incentive to have the Mozilla people experiment with
allowing Mozilla to do the same.

White Lynx

Oct 3, 2006, 10:55:21 AM10/3/06
to
> >It does not make sense to remove
> >tokens from markup while preserving them in DOM (the semantic value of
> >tokens automatically generated by parser is zero, and not all
> >conversion/interchange tools operate through DOM).
>
> HTML does this all the time. (E.g. inferred <tbody> element as a child
> of <table>, inferred <head> and <body> elements,...) There's nothing
> wrong with inferred elements ... per se.

It is one thing when you can unambiguously infer a completely useless
element that has no semantic value and just groups rows (tbody), and
another thing when you infer out of nowhere elements with either
predefined presentation or semantics, like address or i.

> The only problem occurs when people expect their MathML code (or, more
> pertinently, the software they use to generate it) to be interoperable
> in an XML context.
>
> >> This might also
> >> encourage those building HTML authoring tools to consider interfacing
> >> MathML (either with free or commercial plug-ins) because the XML/XHTML
> >> barrier won't be standing right at their face.
>
> >Once again there is no barrier, XHTML has all the functionality that
> >HTML has and much more.
>
> I disagree strongly. XHTML is a *huge* barrier. CMS's that reliably
> produce XHTML are rare to nonexistent.

You tend to turn simple things into rocket science.

> And many users don't have control over the MIME-type their pages are
> sent with. If they did, you wouldn't have so many RSS feed sent as
> text/XML (which, unless they are plain ASCII, means they are
> automatically ill-formed).

Browsers follow Appendix F.2 of the XML Recommendation (if an XML entity
is in a file, the Byte-Order Mark and encoding declaration are used (if
present) to determine the character encoding), not RFC 3023.

>
> >The only issue is MSIE parser and as noted
> >above several times this issue is N/A to MathML today.
>
> Precisely. MathPlayer2 allows IE/6 to consume MathML embedded in tag
> soup *TODAY*.

Well, MSIE does not deal with MathML in any form and I am not against
embedding MathML in environments other than XML (you can embed it in
LaTeX if you want), but I am against turning it into tagsoup, which is
a different issue.

White Lynx

Oct 3, 2006, 11:01:06 AM10/3/06
to
> There's every incentive to have the Mozilla people experiment with
> allowing Mozilla to do the same.

Nobody is against experiments here, just please use your own unique
name (MathML is in use already), your own namespace (if applicable) and
your own content type (that is, in case the experiment goes beyond the
boundaries of a given content type).
Or alternatively make any changes in the markup language through the
channels provided by the organization that developed this markup, defined
the relevant namespace and registered the content type.

Jacques Distler

Oct 3, 2006, 11:25:03 AM10/3/06
to
In article <1159887321....@c28g2000cwb.googlegroups.com>,
White Lynx <whit...@operamail.com> wrote:

>> HTML does this all the time. (E.g. inferred <tbody> element as a child
>> of <table>, inferred <head> and <body> elements,...) There's nothing
>> wrong with inferred elements ... per se.
>
> One thing when you can unambiguously infer completely useless element
>that has no semantic value

If you can unambiguously infer an element, it matters not a *whit*
whether it is "useful" or "useless."

I could make the same argument about inferred end tags in HTML.
(Inferred elements are just a special case, where both the start and
end tags are optional.)

> > I disagree strongly. XHTML is a *huge* barrier. CMS's that reliably
> > produce XHTML are rare to nonexistent.
>
> You tend to turn simple things into rocket science.

Writing a CMS that reliably produces well-formed XHTML is "simple"?

You should write one then. The world will thank you.

> > Precisely. MathPlayer2 allows IE/6 to consume MathML embedded in tag
>
> > soup *TODAY*.
>
>Well, MSIE does not deal with MathML in any form and I am not against
>embedding MathML in environments other than XML (you can embed it in
>LaTeX if you want) but I am against turning it into tagsoup which is
>different issue.

MSIE, with the MathPlayer2 plugin, consumes (well-formed) MathML
fragments embedded in tag-soup ("X")HTML. It does that *TODAY*.

If the idea of MathML in tag soup bothers you, sorry, but it's too
late. That ship has sailed.

David Carlisle

Oct 3, 2006, 11:29:58 AM10/3/06
to whit...@operamail.com, www-...@w3.org, dev-tec...@lists.mozilla.org

> Well, MSIE does not deal with MathML in any form

This isn't really the case. It's true that if you are using
IE+MathPlayer then the math rendering is being done by an application
produced by Design Science rather than Microsoft, but would you say that
"Opera doesn't deal with applets in any form" just because executing an
applet requires a JDK from sun (or some other Java virtual machine)?
In practice, what a user experiences as "the browser" might be any
number of applications from multiple companies.

IE, for all its faults, has a rather sensible way of dealing with
extending HTML with XML languages (MathML, SVG, ...). Mozilla, leveraging
off its open source basis, requires the core engine to be extended to
support these languages. IE on the other hand exposes an API that
allows a particular rendering engine to register itself to render
specific XML namespaces. The actual implementation of the idea in IE
unfortunately has some flaws in that it requires explicit COM ids being
declared in an object element in the page, and requires a non standard
namespace declaration syntax. However these flaws can be hidden from the
user as long as some guidelines are followed.

> and I am not against
> embedding MathML in environments other than XML (you can embed it in
> LaTeX if you want)

Yes, I once implemented an XML parser in TeX, with that in mind...
http://www.google.co.uk/search?q=xmltex

> but I am against turning it into tagsoup which is
> different issue.

I agree, and this is one of the merits of the IE approach, that I hope
would be seriously considered for mozilla. It isn't necessary for
HTML <4+n> to specify "html-variants" of the various XML languages, _any_
_well formed_ XML fragments can be included, so long as you register the
namespace with the application to bind it to a rendering component. In
IE that binding happens in the html page itself, but it would be better
done at the browser level.
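
The general idea of binding a namespace to a rendering component can be sketched in a few lines of Python. This is only an illustration of namespace-based dispatch over well-formed fragments; the registry, the function names and the string "renderers" are all hypothetical, and none of it reflects IE's actual COM-based API.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical browser-level registry: namespace URI -> rendering component.
RENDERERS = {}


def register(namespace, renderer):
    """Bind a namespace URI to a callable that renders an element tree."""
    RENDERERS[namespace] = renderer


def namespace_of(element):
    """Extract the namespace URI from ElementTree's '{uri}local' tag form."""
    tag = element.tag
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


def dispatch(fragment):
    """Parse a well-formed XML fragment and hand it to its renderer.

    ET.fromstring raises ParseError for ill-formed input, so only
    well-formed islands ever reach a rendering component.
    """
    root = ET.fromstring(fragment)
    renderer = RENDERERS.get(namespace_of(root))
    if renderer is None:
        return None  # no component registered for this namespace
    return renderer(root)
```

With a component registered for the MathML namespace, any well-formed `<math xmlns="http://www.w3.org/1998/Math/MathML">` island is routed to it, and the host format needs no knowledge of MathML itself.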

I think that if a simpler linear input form without so much element
markup overhead is required, (and almost certainly it is required)
then something more like
http://www1.chapman.edu/~jipsen/mathml/asciimath.html
is what is wanted (ie, no element markup at all). Asciimath as published
at the above address does the expansion to MathML on the client (so it
is the tex-like syntax that would be served) but an alternative would be
to do the expansions on the server, which is essentially the wiki
approach, allowing you to write 1+x^2 as shorthand for
<mn>1</mn><mo>+</mo>...
just as
* zzz
is shorthand for <ul><li>zzz... in many wiki variants.
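
A toy version of such a server-side expansion might look like the Python sketch below. It is not ASCIIMath's grammar, only an assumed minimal one: digit runs become <mn>, single letters become <mi>, "^" attaches the next atom as a superscript, and every other character becomes <mo>. The function name is hypothetical.

```python
import re
from xml.sax.saxutils import escape

# One token per match: a number, a single letter, or any other
# non-space character (treated as an operator).
TOKEN = re.compile(r"(\d+(?:\.\d+)?)|([A-Za-z])|(\S)")


def to_presentation_mathml(expr):
    """Expand a linear expression such as '1+x^2' into MathML tokens."""
    atoms = []
    pending_sup = False
    for number, letter, other in TOKEN.findall(expr):
        if other == "^":
            pending_sup = True
            continue
        if number:
            atom = "<mn>" + number + "</mn>"
        elif letter:
            atom = "<mi>" + letter + "</mi>"
        else:
            atom = "<mo>" + escape(other) + "</mo>"
        if pending_sup and atoms:
            # Wrap the previous atom and this one as base and superscript.
            atoms[-1] = "<msup>" + atoms[-1] + atom + "</msup>"
        else:
            atoms.append(atom)
        pending_sup = False
    return "".join(atoms)
```

For the example in the text, `to_presentation_mathml("1+x^2")` yields `<mn>1</mn><mo>+</mo><msup><mi>x</mi><mn>2</mn></msup>`; the author types seven characters and the markup overhead is produced by the tool.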

David


Jacques Distler

Oct 3, 2006, 11:56:26 AM10/3/06
to
In article
<mailman.6561.115988941...@lists.mozilla.org>, David
Carlisle <dav...@nag.co.uk> wrote:


>I think that if a simpler linear input form without so much element
>markup overhead is required, (and almost certainly it is required)
>then something more like
>http://www1.chapman.edu/~jipsen/mathml/asciimath.html
>is what is wanted (ie, no element markup at all). Asciimath as published
>at the above address does the expansion to MathML on the client (so it
>is the tex-like syntax that would be served) but an alternative would be
>to do the expansions on the server,

Whether one uses a Wiki-like syntax, or a tex-like syntax, and whether
the expansion to MathML is done server-side (as, say, in blahtex or
itex2MML), or client-side (as in asciimath), it is simply the case that
*no one* hand-authors MathML in a production environment. Hence, I
agree, that the verbosity of MathML is a non-issue.

>The actual implementation of the idea in IE unfortunately has some flaws
>in that it requires explicit COM ids being declared in an object element
>in the page, and requires a non standard namespace declaration syntax,
>However these flaws can be hidden from the user as long as some guidelines
>are followed.

With MathPlayer2, these flaws are hidden from the author as well.
IE+MathPlayer2 can consume bog-standard XHTML+MathML documents. (With
the proviso that the "XHTML" is actually treated as tag-soup.)

I also happen to like the IE approach, which requires well-formed
MathML. Since the MathML content (like most SVG content) is produced by
automated tools, it is not too much of a burden to demand (or expect)
that it be well-formed.
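
What "well-formed MathML inside tag soup" amounts to can be sketched in Python as follows. The sketch pulls each <math>...</math> island out of an arbitrary HTML string with a regular expression and asks an XML parser whether the island, taken alone, is well-formed. The function name is hypothetical, and the approach is deliberately naive: an island that uses named entities such as &dagger; without a DTD would be reported as ill-formed by this check.

```python
import re
import xml.etree.ElementTree as ET

# Non-greedy match of each <math ...> ... </math> island in the page text.
MATH_ISLAND = re.compile(r"<math\b.*?</math\s*>", re.DOTALL)


def check_math_islands(html_text):
    """Return a list of (fragment, is_well_formed) pairs."""
    results = []
    for match in MATH_ISLAND.finditer(html_text):
        fragment = match.group(0)
        try:
            ET.fromstring(fragment)
            well_formed = True
        except ET.ParseError:
            well_formed = False
        results.append((fragment, well_formed))
    return results
```

The surrounding page can be as sloppy as it likes (unclosed <p>, bare <br>); only the islands are held to XML rules, which is a small demand to make of an automated generator.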

William F Hammond

Oct 3, 2006, 12:30:33 PM10/3/06
to dev-tec...@lists.mozilla.org
Jacques Distler <dis...@golem.ph.utexas.edu> writes:

> In article <1159887321....@c28g2000cwb.googlegroups.com>,
> White Lynx <whit...@operamail.com> wrote:
>
>>> HTML does this all the time. (E.g. inferred <tbody> element as a child
>>> of <table>, inferred <head> and <body> elements,...) There's nothing
>>> wrong with inferred elements ... per se.
>>
>> One thing when you can unambiguously infer completely useless element
>>that has no semantic value
>
> If you can unambiguously infer an element, it matters not a *whit*
> whether it is "useful" or "useless."
>
> I could make the same argument about inferred end tags in HTML.
> (Inferred elements are just a special case, where both the start and
> end tags are optional.)

As I understand it, <tbody> is off point in relation to White Lynx's
concern.

I think he was originally speaking against having entity names like
&dagger; available in a user agent's DOM while they are formally
excluded from an author's content as shipped through the web.

Named or not it's CDATA, and I think it's something of a house of
cards to be making item-by-item decisions on bits of CDATA. It's a
covert resurrection of SDATA. How can one hope for consistency across
various user agents?

>>Well, MSIE does not deal with MathML in any form and I am not against
>>embedding MathML in environments other than XML (you can embed it in
>>LaTeX if you want) but I am against turning it into tagsoup which is
>>different issue.
>
> MSIE, with the MathPlayer2 plugin, consumes (well-formed) MathML
> fragments embedded in tag-soup ("X")HTML. It does that *TODAY*.

I think the idea of an "Appendix C" profile that was mentioned, I
believe, by Jacques Distler is meritorious. I also think it
consistent with what Roger Sidje has said.

I can understand why XML namespace prefixes would be problematical in
HTML 5, but I see no harm in allowing suitably profiled content that
would be valid XHTML+MathML when served as "application/xhtml+xml" to
sail also as HTML 5 IF the current discussion makes any sense at all.
In particular, Ian Hixie previously said that xmlns attribute settings
would be a conformance violation. But as I read the current whatwg
spec it would be a violation only of the third kind, and I find that
third clause, along with its table example, not persuasive.

> If the idea of MathML in tag soup bothers you, sorry, but it's too
> late. That ship has sailed.

Oh? Your content? Does Mozilla handle it? Where can we see it?

-- Bill

Juan R.

Oct 3, 2006, 12:50:48 PM10/3/06
to
Chris Chiasson wrote:
> White Lynx wrote:
> >All other approaches are closely
> > integrated in some extensible framework with powerful formatting
> > mechanism (LaTeX/TeX, ISO-12083/SGML+DSSSL, OfficeMath/WordML).
>
> I don't know how common the knowledge is, but MathML is closely tied
> with a certain platform: Mathematica
>
> Wolfram Research (makers of Mathematica) was one of the originators of
> MathML. Anyway, present day MathML is strongly related to Mathematica's
> internal representation of math, as shown in this short example.

It is interesting that the November 1995 Wolfram Research draft for math
on the web was never approved. The final April 1998 MathML W3C
Recommendation is, of course, not completely unrelated to the early
Wolfram draft, but it is not the same, much as MathML is not ISO-12083 or
TeX even if some similarities exist.

> Consider Euler's formula as entered in the most source-code like syntax
> available in Mathematica (called InputForm):
>
> E^(I*x)==Cos[x]+I*Sin[x]
>
> After parsing, it becomes this (called FullForm):
>
> Equal[Power[E,Times[Complex[0,1],x]],Plus[Cos[x],Times[Complex[0,1],Sin[x]]]]
>
> Compare this with content MathML
>
> <math
> xmlns='http://www.w3.org/1998/Math/MathML'><apply><eq/><apply><power/><
> exponentiale/><apply><times/><imaginaryi/><ci>x</ci></apply></apply><apply><
> plus/><apply><cos/><ci>x</ci></apply><apply><times/><imaginaryi/><apply><sin/>
> <ci>x</ci></apply></apply></apply></apply></math>
>
> Notice how <apply> is used to capture the structure of
> head[arg1,arg2,arg3] as <apply><head/><arg1/><arg2/><arg3/></apply>.

The first thing you write is an M-expression. The second is an XML
encoding of an S-expression. They are two different concepts even if you can
transform between both. Moreover, take the LISP/Scheme representation

(head arg1 arg2 arg3).

where the ( ) indicates an application to be evaluated. Which is
closer to c-MathML: Lisp or Mathematica?
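
The correspondence is mechanical enough to sketch in Python. Below, a prefix expression is modelled as a nested tuple (head, arg1, arg2, ...), and each application becomes an <apply> element whose first child is the head. The function name and the small set of constant names are assumptions made for this illustration, not part of any MathML tool.

```python
# Content MathML constants that are written as empty elements rather than <ci>.
# Only the ones needed for the example; this set is an assumption of the sketch.
CONSTANTS = {"exponentiale", "imaginaryi", "pi"}


def to_content_mathml(expr):
    """Serialise a nested-tuple prefix expression as content MathML."""
    if isinstance(expr, tuple):
        head, *args = expr
        inner = "".join(to_content_mathml(arg) for arg in args)
        return "<apply><" + head + "/>" + inner + "</apply>"
    if isinstance(expr, (int, float)):
        return "<cn>" + str(expr) + "</cn>"
    if expr in CONSTANTS:
        return "<" + expr + "/>"
    return "<ci>" + expr + "</ci>"


# Euler's formula, written the way a Lisp programmer would nest it.
euler = ("eq",
         ("power", "exponentiale", ("times", "imaginaryi", "x")),
         ("plus", ("cos", "x"), ("times", "imaginaryi", ("sin", "x"))))
```

Calling `to_content_mathml(euler)` reproduces the body of the <math> element quoted above, which suggests that the <apply> structure follows from prefix notation in general rather than from any one system's internals.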

What is more, so far as I know Mathematica uses M-expressions just at
the syntax level, not as its internal representation.

> So those plentiful <mrow> elements shouldn't be unexpected. Also, it
> becomes pretty apparent why presentation MathML is nearly
> incomprehensible. It is a representation of an already verbose two
> dimensional box formatting system in XML, making it even more verbose.

Are you claiming that the best way to improve comprehensibility while
decreasing the verbosity of "those plentiful <mrow> elements" is to
promote a new syntax (Ian's syntax) that keeps all the mrows
there?

XML was really designed for documents, not raw data. MathML is poor
still because data is defined at the token level. One did not need to
be a genius to compute the order of magnitude of the file-size overhead
of such an approach. The consequences should have been computed _before_
implementing MathML natively. The Microsoft (embedded islands plus
plugin) and Opera (CSS + JS) moves were much more intelligent.

> IMHE (in my humble estimation) Firefox people would be better off
> trying to define "shorthand" definitions for the content MathML system,
> which WRI will be less likely to oppose.

Does this make sense if FF (Mozilla) does not support content MathML
either natively or through a plugin?

Juan R.

Oct 3, 2006, 1:08:36 PM10/3/06
to
Jacques Distler wrote:
> In article <1159887321....@c28g2000cwb.googlegroups.com>,
> White Lynx <whit...@operamail.com> wrote:
>
> > > I disagree strongly. XHTML is a *huge* barrier. CMS's that reliably
> > > produce XHTML are rare to nonexistent.
> >
> > You tend to turn simple things into rocket science.
>
> Writing a CMS that reliably produces well-formed XHTML is "simple"?
>
> You should write one then. The world will thank you.
>

I agree that the production of XHTML (even strict) is not rocket
science.

With MSIE not supporting XHTML and the Mozilla implementation really
sucking (even the Mozilla guys recommend using HTML rather than XHTML
when you are not benefiting from other XML applications: MathML, SVG,
etc.), there is no commercial interest in first-class XHTML tools, and
most developers simply adapted their previous HTML presentational
algorithms to the X hype.

What is the benefit of writing nice XHTML tools for science if
people like you then introduce all kinds of crazy code (incorrect
rendering, extra mrows collapsing the Mozilla engine, numbers split at
the decimal point, ds^2 being encoded as 2s ds...) when using your
inefficient IteX plugin on the Internet?

However, there exist a couple of XHTML tools generating good code (a
few can even generate pure strict code already) and tools generating very
good MathML code, and next year we could, maybe, see Word generating
XHTML for blogs (it appears that W3C-validated strict code is their
target).

Juan R.

Oct 3, 2006, 1:18:11 PM10/3/06
to

Jacques Distler wrote:
>
> Whether one uses a Wiki-like syntax, or a tex-like syntax, and whether
> the expansion to MathML is done server-side (as, say, in blahtex or
> itex2MML), or client-side (as in asciimath), it is simply the case that
> *no one* hand-authors MathML in a production environment. Hence, I
> agree, that the verbosity of MathML is a non-issue.

One should not confuse asciimath with asciimathJS. Contrary to
itex2MML, asciimath works on both the client and the server side. Look at
the PHP version

[http://www.jcphysics.com/ASCIIMath/]

Other versions could be developed at the server side.

Since this stuff has already been available for years, there is no need for
Ian's mixed syntax, which is still an order of magnitude more verbose
than asciimath, itex, latex... and therefore will remain unpopular.

White Lynx

Oct 3, 2006, 1:28:36 PM10/3/06
to
> I think he was originally speaking against having entity names like
> &dagger; available in a user agent's DOM while they are formally
> excluded from an author's content as shipped through the web.

No I meant mi, mo, mn token elements.

> >> HTML does this all the time. (E.g. inferred <tbody> element as a child
> >> of <table>, inferred <head> and <body> elements,...) There's nothing
> >> wrong with inferred elements ... per se.
>
> > One thing when you can unambiguously infer completely useless element
> >that has no semantic value
>
> If you can unambiguously infer an element, it matters not a *whit*
> whether it is "useful" or "useless."

If. And if so they would not be introduced at all.

>
> I could make the same argument about inferred end tags in HTML.
> (Inferred elements are just a special case, where both the start and
> end tags are optional.)
>


> > > I disagree strongly. XHTML is a *huge* barrier. CMS's that reliably
> > > produce XHTML are rare to nonexistent.
>
> > You tend to turn simple things into rocket science.
>
> Writing a CMS that reliably produces well-formed XHTML is "simple"?
>
> You should write one then. The world will thank you.

I don't need it (neither CMS nor thanks).

> If the idea of MathML in tag soup bothers you, sorry, but it's too
> late. That ship has sailed.

It is not my fault.

Jacques Distler

Oct 3, 2006, 2:48:35 PM10/3/06
to
In article <1159896516.3...@i3g2000cwc.googlegroups.com>,
White Lynx <whit...@operamail.com> wrote:

> No I meant mi, mo, mn token elements.
>
>> HTML does this all the time. (E.g. inferred <tbody> element as a child
>> of <table>, inferred <head> and <body> elements,...) There's nothing
>> wrong with inferred elements ... per se.
>>
>>> One thing when you can unambiguously infer completely useless element
>>>that has no semantic value
>>
>> If you can unambiguously infer an element, it matters not a *whit*
>> whether it is "useful" or "useless."
>
> If. And if so they would not be introduced at all.

If one CAN'T unambiguously infer <mo>, <mi> and <mn> elements, then
there's no point in arguing whether it's a good idea to make them
optional. They would *necessarily* (by the design criteria of the
proposal) be required elements.

The only case to argue is *if* (I haven't thought about it, so I'll
assume Ian is correct) they *can* be unambiguously inferred, whether it
is a good idea to do so.

I have argued that it is *not*, but for reasons that have nothing to do
with the alleged semantic value (or lack thereof) of these elements.

In any case, if one followed my proposal of creating a
tag-soup-parser-safe profile of MathML, then there would be no
discussion here. These are required elements in MathML (they are not
inferred); ergo, they would be required elements in any profile of
MathML.

William F Hammond

Oct 3, 2006, 4:25:44 PM10/3/06
to dev-tec...@lists.mozilla.org
Jacques Distler <dis...@golem.ph.utexas.edu> writes:

> If one CAN'T unambiguously infer <mo>, <mi> and <mn> elements, then
> there's no point in arguing whether it's a good idea to make them
> optional. They would *necessarily* (by the design criteria of the
> proposal) be required elements.
>
> The only case to argue is *if* (I haven't thought about it, so I'll
> assume Ian is correct) they *can* be unambiguously inferred, whether it
> is a good idea to do so.
>
> I have argued that it is *not*, but for reasons that have nothing to do
> with the alleged semantic value (or lack thereof) of these elements.
>
> In any case, if one followed my proposal of creating a
> tag-soup-parser-safe profile of MathML, then there would be no
> discussion here. These are required elements in MathML (they are not
> inferred); ergo, they would be required elements in any profile of
> MathML.

Of course, you are right that inference is not generally possible.

When <math> in HTML5 is to mean, as suggested by Roger Sidje, that the
content is MathML, then its content should be correct MathML -- which
is why the <math> opentag should then bear an xmlns attribute even
though there would be no general XML namespace understanding of
"xmlns" across elements appearing in HTML5.

If HTML5 is going to be reasonable, user agents should have provision
for the special case where a whole HTML5 instance between <html>
and </html> is a valid XHTML or XHTML+MathML instance subject to
Appendix C type profiling rules (which would ban things like xml
namespace prefixes and ask for things like <mspace /> rather than
<mspace/>). But <mi>, <mn>, and <mo> should be mandatory.
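
The two profiling rules mentioned as examples could be checked mechanically. The Python sketch below is only an assumed reading of such a profile, covering exactly those two rules (no namespace prefixes on element names, and a space before "/>" in empty-element tags); the function name and the rule labels are hypothetical, and a real profile would need more than regular expressions.

```python
import re

# A start or end tag whose element name carries a namespace prefix, e.g. <m:mi>.
PREFIXED_TAG = re.compile(r"</?([A-Za-z_][\w.-]*):[A-Za-z_]")
# An empty-element tag closed as "/>" with no whitespace in front, e.g. <mspace/>.
TIGHT_EMPTY_TAG = re.compile(r"<[^<>]*[^\s<>]/>")


def profile_violations(markup):
    """Return a list of (rule, offset) pairs for the two rules above."""
    problems = []
    for match in PREFIXED_TAG.finditer(markup):
        problems.append(("prefix", match.start()))
    for match in TIGHT_EMPTY_TAG.finditer(markup):
        problems.append(("tight-empty", match.start()))
    return problems
```

Content that passes such a check while remaining valid XHTML+MathML is the kind of document that could be served under either media type without change.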

However, I think in this discussion and the discussion at whatwg I've
seen the germ of a further idea for a more casual kind of math that
would be reasonable for human authoring. In that situation it would
be reasonable for TeX default handling of symbols to apply, e.g.,
<mi>Hom</mi>(X, Y), <mi>cos</mi> ax <mi>sin</mi> bx -- the point being
that by default every loose character represents a symbol, and strings
of length > 1 are symbols only when enclosed in <mi> or something like
it. I have no idea, however, as to whether the advocates of HTML5 are
prepared to render this more casual kind of markup.

-- Bill

Ian Hickson

Oct 3, 2006, 5:56:57 PM10/3/06
to David Carlisle, www-...@w3.org, dev-tec...@lists.mozilla.org, whit...@operamail.com
On Tue, 3 Oct 2006, David Carlisle wrote:
>
> I agree, and this is one of the merits of the IE approach, that I hope
> would be seriously considered for mozilla. It isn't necessary for HTML
> <4+n> to specify "html-variants" of the various XML languages, _any_
> _well formed_ XML fragments can be included, so long as you register the
> namespace with the application to bind it to a rendering component.

What are the rules for handling non-well-formed content? (Could you show
me an example of this? Different people seem to mean different things
when they talk about IE's extension models.)

--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'

Ian Hickson

Oct 3, 2006, 6:19:46 PM
to Bruce Miller, www-...@w3.org, dev-tec...@lists.mozilla.org
On Tue, 3 Oct 2006, Bruce Miller wrote:
>
> In reference to "CMS" not reliably support XHTML, (depending on exactly
> what is meant by "CMS"), most web tools[**] seem unable to reliably
> generate (conformant) HTML either -- in the current web, there's little
> motivation. Again, it's hard to see that HTML5 will improve that.

HTML5 will not improve people's authoring skills. However, it will
(hopefully, at least!) improve the interoperability of UAs when handling
broken pages -- with HTML5 we no longer have "tag soup", because every
stream of input characters maps to a single well-defined DOM. There's no
more guesswork involved.

Roger B. Sidje

Oct 3, 2006, 6:27:03 PM
to White Lynx, www-...@w3.org, dev-tec...@lists.mozilla.org
These comments drift the discussion to tangential topics not in the
original proposal. I think I have pretty much answered all these points.
To briefly re-iterate a few:

W3C can continue to define all other XML formats. (These seldom need
special rendering and are unlikely to be fed to the browser.)
XML/XHTML+MathML is done with and will remain so. Little else to gain
there. Interoperability with the XML production line is there.

HTML5+MathML remains a big uncharted territory for Gecko, and ideally I
would hope for MathML to work there too (pretty much like in
IE+MathPlayer). [HTML5 is anything sent as text/html, so digestible by
IE too.]

Tag-soup is being over-emphasized because it attracts attention,
ignoring (when it suits you) that the reality is different, owing to
automatic generation.
---
RBS

On 3/10/2006 5:18 PM, White Lynx wrote:

>
>>Damaging to what? To MathML? Not really in my opinion. What damage could
>>there be to have plenty of MathML formulas on the web?!?
>
>
> What prevents you from having plenty of formulae on the web today? Do we have at least one MathML implementation that supports HTML but lacks XHTML support? Do we have MathML implementations that support XHTML only? So how will introducing two different and incompatible sets of parsing rules improve interoperability? And assume that you have plenty of formulae on the web and you want to process them. How does having half of
> them in tagsoup and the other half in XML make them easier to handle?
>
>
>>But to the
>>XML/XHTML agenda, possibly. And that has been the real "problem" since
>>the beginning, and which I alluded to in my opening post.
>
>
> It is not the beginning. Seven years have passed since that time, and a lot of XML applications have emerged since then. Most current W3C specifications are designed with XML in mind, not SGML or HTML. MathML is part of a large and extensible framework where it can be combined with other XML applications. The current proposal adds no new functionality to MathML, but rather artificially splits the MathML community into incompatible parts that have to be dealt with separately.
>
>
>>Interested in using MathML? First pass that XHTML barrier, and that
>>wasn't even a small barrier. It was a significant barrier, taking seven
>>years before IE understood application/xhtml+xml.
>
>
> It was. But it is not anymore. So it is not clear what you are struggling with. Maybe someone has to struggle with legacy text/html content, but that is not our problem: we have no MathML-in-HTML legacy. Maybe someone complains that MSIE does not support application/xhtml+xml; again, that is not our problem, since without MathPlayer MSIE cannot process MathML, while with MathPlayer the application/xhtml+xml problem is N/A.
> If someone doubts the future of XML in MSIE, note that Microsoft's own mathematical markup language is (and most other recent format$ are) entirely XML based.
>
>
>>MathML already works in XML/XHTML and this proposal is not going to
>>break that.
>
>
> XML for maths means better interoperability (and extensibility); this proposal splits MathML into two different versions.
>
>

>>This might also
>>encourage those building HTML authoring tools to consider interfacing
>>MathML (either with free or commercial plug-ins) because the XML/XHTML
>>barrier won't be standing right at their face.
>
>

> Once again there is no barrier, XHTML has all the functionality that HTML has and much more. The only issue is MSIE parser and as noted above several times this issue is N/A to MathML today.


>
>
>>Many math pages wouldn't have bothered
>>with XHTML if it had been possible to have MathML in HTML
>
>
> Which means that going in that direction will give rise to two different versions of MathML, damaging interoperability and introducing no new functionality.
>
>
>>MathML-in-HTML5? Worth a try.
>
>
> Once you try something you can't always untry it. Just proceed
> with your proposal and we will have to struggle with text/html legacy forever.

[...inconsistency elsewhere...]

> Well, MSIE does not deal with MathML in any form and I am not against
> embedding MathML in environments other than XML (you can embed it in

> LaTeX if you want) but I am against turning it into tagsoup which is a
> different issue.

David Carlisle

Oct 3, 2006, 6:33:08 PM
to i...@hixie.ch, www-...@w3.org, dev-tec...@lists.mozilla.org

Ian,

> What are the rules for handling non-well-formed content?
not sure what the "rules" are (as in whether they are published
anywhere), perhaps someone from DS (or Microsoft for that matter) could
give more information, but empirically what I think happens is that
if you register the namespace on m: with a component then any top level
element (<m:math>..</m:math> in our case) gets handed over to the
component, well formed or not, and then it's up to the component what it
does with it. Mathplayer for example usually tries to make something
out of incorrect (including non well formed) content, but always
renders it in a red error box.

> (Could you show me an example of this?

yes.

Taking the example whose markup is shown and described here

http://www.dessci.com/en/products/mathplayer/author/creatingpages.htm#AnatomyMathPlayerWebPage

I've cut out the example and made it into a file served here:
http://www.dcarlisle.demon.co.uk/mml1.html

Here's the same file made gratuitously non-well formed (all end tags
made into start tags)

http://www.dcarlisle.demon.co.uk/mml2.html

It renders in red, and if you use the right menu to "copy mathml"
and paste the copied markup into a text editor you will see

<math>
<!-- Error encountered and repaired:
Too few children in <msup> node
-->
<msup>
<!-- Error encountered and repaired:
Too many children in <mi> node
-->
<mi>x 2 + 9 x + 9 = 0</mi>
<mrow>
</mrow>
</msup>
</math>


So basically in the case of mathplayer any non well formed text is
displayed as an error, although I suspect that isn't enforced by the API
exposed by IE for XML fragments.
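
For contrast with MathPlayer's repair-and-flag behaviour, here is a minimal sketch of what a strict XML parser does with the same kind of damage (end tags turned into start tags). It uses Python's standard xml.etree.ElementTree; the two fragments are made up for illustration and are not the contents of mml1.html or mml2.html, and is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

def is_well_formed(fragment):
    """Return True if a conforming XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

# Illustrative fragment for x squared.
good = f'<math xmlns="{MATHML_NS}"><msup><mi>x</mi><mn>2</mn></msup></math>'

# The same fragment with every end tag turned into a start tag, so that
# nothing is ever closed.
bad = f'<math xmlns="{MATHML_NS}"><msup><mi>x<mi><mn>2<mn><msup><math>'

print(is_well_formed(good), is_well_formed(bad))
```

A strict parser simply rejects the second fragment; whether to recover and show an error box, as MathPlayer does, is a decision made by the component and not by the parser.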

I should stress I'm just a user here I have no inside knowledge about
any of the components being discussed.

David

Paul Topping

Oct 3, 2006, 7:08:02 PM
to David Carlisle, i...@hixie.ch, www-...@w3.org, dev-tec...@lists.mozilla.org
Just to confirm, David is correct: MathPlayer's treatment of bad input
(i.e., malformed MathML) by putting it in a red box is its own invention.
IE plays no role. MathPlayer is just given a DOM tree rooted at the
<math> node and it can do as it pleases. (Actually, MathPlayer has
access to the entire DOM and more if it wants it.) Two consequences of
this are that (a) MathPlayer doesn't have to implement an XML parser and
(b) it can't correct any limitations or errors in IE's parsing.

Paul

> -----Original Message-----
> From: dev-tech-ma...@lists.mozilla.org
> [mailto:dev-tech-ma...@lists.mozilla.org] On Behalf
> Of David Carlisle
> Sent: Tuesday, October 03, 2006 3:33 PM
> To: i...@hixie.ch
> Cc: www-...@w3.org; dev-tec...@lists.mozilla.org
> Subject: Re: MathML-in-HTML5
>
>

Ian Hickson

Oct 3, 2006, 7:19:41 PM
to David Carlisle, www-...@w3.org, dev-tec...@lists.mozilla.org
On Tue, 3 Oct 2006, David Carlisle wrote:
> >
> > What are the rules for handling non-well-formed content?
>
> not sure what the "rules" are (as in whether they are published
> anywhere), perhaps someone from DS (or Microsoft for that matter) could
> give more information, but empirically what I think happens is that if
> you register the namespace on m: with a component then any top level
> element (<m:math>..</m:math> in our case) gets handed over to the
> component, well formed or not, and then it's up to the component what it
> does with it. Mathplayer for example usually tries to make something
> out of incorrect (including non well formed) content, but always
> renders it in a red error box.

So basically, it's the same as tag soup. I don't really see an advantage
to going down that route (with its complexities like namespace prefixes,
etc) -- if anything, the lack of success of IE's extension mechanism
should be taken as a sign that this is not the path to follow. (Compare
this to, e.g., <marquee>, which was so widely used that other browsers
were forced to support it.)

Roger B. Sidje

Oct 3, 2006, 7:41:16 PM
to Ian Hickson, www-...@w3.org, dev-tec...@lists.mozilla.org, David Carlisle
On 4/10/2006 9:19 AM, Ian Hickson wrote:
>
> So basically, it's the same as tag soup. I don't really see an advantage
> to going down that route (with its complexities like namespace prefixes,
> etc)

One could also think that prefixed tags are random tags in general.

Is it because you are thinking globally w.r.t. multiple mixings? Such as
<m:tag> <n:tag>...</n:tag> </m:tag>?

(I am not sure what IE does if n: is attached to a plug-in.)

If it is just for the single MathML containment, the starting patch that I
made emulates what IE+MathPlayer does:

<html xmlns:m="mathml-namespace">, then
<m:math>...</m:math> in the document

or

<math xmlns="mathml-namespace">...</math>
with no need for a declaration in <html>

(So it is a proof-of-concept of that emulation, and as the patch showed,
not that invasive to emulate.)
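
To make the equivalence of the two spellings concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. It shows only how a namespace-aware XML parser sees the two forms; it says nothing about how the Gecko patch or IE handles them in text/html. The helper name qualified_names is hypothetical, and "mathml-namespace" above is written out as the MathML namespace URI.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Form 1: prefix declared on <html>, prefixed elements in the body.
prefixed = (
    f'<html xmlns="{XHTML_NS}" xmlns:m="{MATHML_NS}">'
    "<body><m:math><m:mi>x</m:mi></m:math></body></html>"
)

# Form 2: default namespace declared on <math> itself.
defaulted = (
    f'<html xmlns="{XHTML_NS}">'
    f'<body><math xmlns="{MATHML_NS}"><mi>x</mi></math></body></html>'
)

def qualified_names(document):
    """Return the namespace-qualified tag of every element, in document order."""
    return [el.tag for el in ET.fromstring(document).iter()]

print(qualified_names(prefixed) == qualified_names(defaulted))
```

Both documents yield the same list of qualified names, so the prefix is purely a spelling choice once the namespace binding is known.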
---
RBS

Ian Hickson

Oct 3, 2006, 7:43:25 PM
to Roger B. Sidje, www-...@w3.org, dev-tec...@lists.mozilla.org, David Carlisle
On Wed, 4 Oct 2006, Roger B. Sidje wrote:
>
> On 4/10/2006 9:19 AM, Ian Hickson wrote:
> >
> > So basically, it's the same as tag soup. I don't really see an
> > advantage to going down that route (with its complexities like
> > namespace prefixes, etc)
>
> Is it because you are thinking globally w.r.t. multiple mixings?

No. It's just that anything with namespaces causes authors confusion. I
don't see the advantages here outweighing the disadvantages.

Paul Topping

Oct 3, 2006, 7:48:28 PM
to Ian Hickson, David Carlisle, www-...@w3.org, dev-tec...@lists.mozilla.org
The fact that adding XML islands to tag soup doesn't turn it into steak
shouldn't surprise anyone. But I don't think it is fair to dismiss the
value of adding MathML to HTML on that basis. I think Microsoft provided
a very simple mechanism that allows XML islands inside tag soup but
without making it more soupy, to stretch the metaphor a bit.

I'm not sure what you mean by "lack of success of IE's extension
mechanism". If you mean that Mozilla didn't adopt it, then you are
right. Given that the world's web content is largely tag soup, at no
fault of Microsoft that I know of, they added XML islands to it in a
straightforward way that could have been adopted by all other browsers.
I never heard anyone say that Mozilla didn't adopt it because it was
implemented badly. They just put all their eggs in the XHTML basket
hoping it would drive content producers to use XHTML. If anything shows
lack of success, it is XHTML.

Paul

> -----Original Message-----
> From: dev-tech-ma...@lists.mozilla.org
> [mailto:dev-tech-ma...@lists.mozilla.org] On Behalf
> Of Ian Hickson
> Sent: Tuesday, October 03, 2006 4:20 PM
> To: David Carlisle
> Cc: www-...@w3.org; dev-tec...@lists.mozilla.org
> Subject: Re: MathML-in-HTML5
>

> On Tue, 3 Oct 2006, David Carlisle wrote:
> > >
> > > What are the rules for handling non-well-formed content?
> >
> > not sure what the "rules" are (as in whether they are published
> > anywhere), perhaps someone from DS (or Microsoft for that
> matter) could
> > give more information, but empirically what I think happens
> is that if
> > you register the namespace on m: with a component then any top level
> > element (<m:math>..</m:math> in our case) gets handed over to the
> > component, well formed or not, and then it's up to the
> component what it
> > does with it. Mathplayer for example usually tries to make
> something
> > out of incorrect (including non well formed) content, but always
> > renders it in a red error box.
>

> So basically, it's the same as tag soup. I don't really see
> an advantage
> to going down that route (with its complexities like
> namespace prefixes,

> etc) -- if anything, the lack of success of IE's extension mechanism
> should be taken as a sign that this is not the path to
> follow. (Compare
> this to, e.g., <marquee>, which was so widely used that other
> browsers
> were forced to support it.)
>

> --
> Ian Hickson U+1047E
> )\._.,--....,'``. fL
> http://ln.hixie.ch/ U+263A /, _.. \
> _\ ;`._ ,.
> Things that are impossible just take longer.
> `._.-(,_..'--(,_..'`-.;.'

Chris Chiasson

Oct 3, 2006, 7:54:11 PM
to
Juan R. wrote:
> It is interesting that Nov, 1995 Wolfram Research draft for Math on the
> web was never approved. The final Apr, 1998 MathML W3C recommendation,
> of course, is not completely unrelated to early Wolfram draft, but is
> not the same, somewhat as MathML is not ISO-12083 or TeX even if there
> exist some similarities.

I don't get it. I demonstrated strong similarities. I didn't say that
MathML is isomorphic to Mathematica syntax. What are you trying to refute?
Do we really disagree here?

> > Notice how <apply> is used to capture the structure of
> > head[arg1,arg2,arg3] as <apply><head/><arg1/><arg2/><arg3/></apply>.
>
> The first you write is a M expression. The second is a xml encoding of
> a S expression. They are two different concepts even if you can
> transform between both. Moreover, take the LISP/Scheme representation
>
> (head arg1 arg2 arg3).
>
> where the ( ) indicates an application to be evaluated. What is more
> close to c-MathML? Lisp or Mathematica?
>
> What is more, so far as i know Mathematica uses M expressions just at
> the syntax level not as internal representation.

Neither do I know what an S expression is nor do I know what an M
expression is.

Internally, a Mathematica expression is a one dimensional array of
pointers to other expressions, symbols, strings, or numbers, as shown
here:

http://documents.wolfram.com/mathematica/book/section-A.9.2

The first position in the array is called the head, and the normal
formatting for an expression is as I showed: head[arg1,arg2,arg3]. If
the formatting were different, such as (head arg1 arg2 arg3), this
would probably require a change in parsing and syntax rules, but the
internal representation would be the same.

> > So those plentiful <mrow> elements shouldn't be unexpected. Also, it
> > becomes pretty apparent why presentation MathML is nearly
> > incomprehensible. It is a representation of an already verbose two
> > dimensional box formatting system in XML, making it even more verbose.
>
> Are you claiming that the best way to improve comprehensibility whereas
> decreasing verbosity of "those plentiful <mrow> elements" may be
> promoting a new syntax (Ian's syntax) maintaining just all the mrows
> there?

The word best never appeared in my description of presentation MathML,
so no I am not claiming it is the best way to represent typeset math. I
am not really sure what you mean in the rest of your paragraph.

> XML was really designed for documents not raw data. MathML is poor
> still because data is defined at the token level. It was not needed to
> be a genious for computing the order of magnitude on file oversize from
> such one approach. Consequences would have been computed _before_
> implementing MathML in a native way. Microsoft (embbedded islands more
> plugin) and Opera (CSS + JS) movements were much more intelligent.

I don't know what you mean here. I will say that if one wants only the
semantics to be stored in a document, then one should use content
MathML. Presentation MathML comes with a lot of "data baggage" because
it is designed to tell a renderer how to format a given piece of math.
By allowing all kinds of shorthand presentation notation, the
renderer's job becomes more difficult.

> Does this sense that if FF (Mozilla) do not support content MathML in
> either native or plugin way?

Again, I do not know exactly what you mean here.

David Carlisle

Oct 3, 2006, 8:02:40 PM
to i...@hixie.ch, www-...@w3.org, dev-tec...@lists.mozilla.org

> So basically, it's the same as tag soup.
Not sure I understand that comment (given that incorrect input is
flagged as an error rather than being silently accepted) but let that
pass.

> I don't really see an advantage to going down that route (with its
> complexities like namespace prefixes, etc)

I was only suggesting the "route" in so far as the general idea of an
extension _mechanism_ that allows XML fragments (with a declared
rendering behaviour) rather than a fixed set of extensions, also
(if a fixed set of say html+mathml+svg is to be used) that
at least the whatwg is aware of the IE approach and should consider
the possibility of defining things such that pages can be written to
work with both mechanisms. Backward compatibility with existing browsers
and existing pages is I know a major issue with html5 and there have
been mathml-in-html pages since IE 5.5 which is a long time. Of course
in terms of number of pages as a proportion of the html web it's a very
small proportion, but still...

Irrespective of whether there is a general extension mechanism (which
mathml then uses) or whether mathml has a privileged position as a
"built in" extension, it seems there are three possible options

1) each <math>..</math> fragment has to be well formed xml (eg it's
broken out and parsed by a real non-validating xml parser rather than
the html parser)

2) the math element is parsed by the html parser using a specifically
extended version of the "html 5" parsing algorithm, resulting in
a DOM that would be the same as if an xhtml+mathml document had been
parsed by an XML parser.

3) the <math> element syntax in html includes some syntax forms that
result in a DOM structure that doesn't match that of MathML.

Orthogonal to those three choices is the issue of whether to subset
(profile) mathml: all mathml?, all presentation mathml? a subset of
presentation mathml?, a subset, but including some mathml3 elements
(as yet unspecified, but Roger for example highlighted equation
labelling as something that should be looked at, which might lead to
wanting to add some features in MathML3 that are included in a profile)

Of the three numbered choices I _think_ I have numbered them in order of
preference. I'd be definitely opposed to (3) as that would I think be
specifying a different language that happens to reuse the name mathml
which would be confusing for everyone.

Currently I place (2), which is I think what is being suggested for mozilla,
as 2nd preference but I could be persuaded that that is the best
solution, especially if the "html" parser would allow (if not enforce)
_all_ the relevant xml syntax, especially empty element syntax />
(mathml has a lot of empty elements, although mostly that is in content
mathml) and namespace declarations. Even if they are ignored they
shouldn't be an error. Pretty much all mathml is generated by tools or
mechanical assistance of some kind, and if those tools are using xml
syntax (as they are) then it won't always be easy for an end user to
"correct" that mechanically generated markup and replace xml idioms by
"html" ones.
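
The request above, that an html parser should accept XML idioms such as /> and xmlns without treating them as errors, can be sketched with Python's standard html.parser. This is only an illustration under stated assumptions: html.parser is a lenient tokenizer, not the "html 5" parsing algorithm under discussion, and the EventLog class and events_for helper are hypothetical names.

```python
from html.parser import HTMLParser

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

class EventLog(HTMLParser):
    """Record the tag events the tokenizer reports, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_startendtag(self, tag, attrs):
        # Called for XML-style empty elements such as <mspace ... />.
        self.events.append(("empty", tag, dict(attrs)))

def events_for(markup):
    parser = EventLog()
    parser.feed(markup)
    parser.close()
    return parser.events

markup = f'<math xmlns="{MATHML_NS}"><mspace width="1em"/></math>'
for event in events_for(markup):
    print(event)
```

Here the xmlns declaration arrives as an ordinary attribute and the empty-element syntax is reported as such; what a browser then does with that information is exactly the design question raised in the message.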

Then of course there's the perennial question about what to do with those
entity definitions... (I suppose "can I give them to someone else" is
not an allowed answer:-)

David

David Carlisle

Oct 3, 2006, 8:13:00 PM
to r...@maths.uq.edu.au, www-...@w3.org, dev-tec...@lists.mozilla.org, i...@hixie.ch

> That example wasn't suggesting anything special about the literal "m"
> indeed -- what it instead suggested was that the prefix string had to be
> declared in the <html>. Right?

Yes, the prefix is arbitrary.
In IE (in html mode) it has to be there and the binding of the prefix to
the rendering component is explicit in the page. I don't think anyone
would want mozilla to have either of those restrictions. If though it
_allowed_ a prefix on <m:math> in html mode it would enable pages to work
in ie and mozilla at the same time, which is never a bad thing, really.

David

Ian Hickson

Oct 3, 2006, 8:28:18 PM
to Paul Topping, www-...@w3.org, dev-tec...@lists.mozilla.org, David Carlisle
On Tue, 3 Oct 2006, Paul Topping wrote:
>
> The fact that adding XML islands to tag soup doesn't turn it into steak
> shouldn't surprise anyone. But I don't think it is fair to dismiss the
> value of adding MathML to HTML on that basis. I think Microsoft provided
> a very simple mechanism that allows XML islands inside tag soup but
> without making it more soupy, to stretch the metaphor a bit.

I'm not saying don't add MathML to HTML. I'm saying don't add namespace
syntax to HTML. I'm eagerly looking forward to seeing what Roger's
experience with adding MathML to HTML is.


> I'm not sure what you mean by "lack of success of IE's extension
> mechanism".

It doesn't have millions of pages using it, the way that other IE
non-standard extensions (e.g. <marquee>) have taken off.

Paul Topping

Oct 3, 2006, 8:36:12 PM
to Ian Hickson, www-...@w3.org, dev-tec...@lists.mozilla.org, David Carlisle
The type of the XML island must be declared somehow or it would
certainly be making more tag soup. As has been stated by other posters,
you either have to duplicate MathML (or some subset of it) in HTML5 or
you have to connect it to the MathML definition. The way one references
an XML definition is by declaring its namespace. One could define that
namespace implicitly in HTML5 by stating in the HTML5 spec that <math>
means MathML 2.0 or whatever. Of course, this locks MathML to one
specific version -- not a good thing.

Paul

Ian Hickson

Oct 3, 2006, 8:38:44 PM
to Paul Topping, www-...@w3.org, dev-tec...@lists.mozilla.org, David Carlisle
On Tue, 3 Oct 2006, Paul Topping wrote:
>
> The type of the XML island must be declared somehow or it would
> certainly be making more tag soup. As has been stated by other posters,
> you either have to duplicate MathML (or some subset of it) in HTML5 or
> you have to connect it to the MathML definition. The way one references
> an XML definition is by declaring its namespace. One could define that
> namespace implicitly in HTML5 by stating in the HTML5 spec that <math>
> means MathML 2.0 or whatever. Of course, this locks MathML to one
> specific version -- not a good thing.

The proposal that I understand Roger intends to experiment with is making
any tag in a particular list of tags be added to the DOM as a node not in
the XHTML namespace but in the MathML namespace.

The MathML namespace doesn't change between versions, since MathML is
backwards compatible across versions. Thus there is no "lock-in" problem,
just like there is no problem of HTML content being "locked" to a
particular version of HTML.
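
The proposal as described here amounts to a lookup from tag name to namespace at tree-construction time. A minimal sketch follows, with loud caveats: the tag list is a small illustrative subset and not the list in Roger's patch, the function name namespace_for is hypothetical, and lower-casing the name is an assumption about text/html case-insensitivity.

```python
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Illustrative subset only; the real list would come from the patch or a spec.
MATHML_TAGS = frozenset(
    {"math", "mrow", "mi", "mn", "mo", "msup", "mfrac", "mspace"}
)

def namespace_for(tag_name):
    """Pick the namespace a parser would assign to a newly created element."""
    if tag_name.lower() in MATHML_TAGS:
        return MATHML_NS
    return XHTML_NS

for name in ("p", "mi", "MROW"):
    print(name, namespace_for(name))
```

Note that nothing in the lookup depends on a MathML version, which matches the point that the namespace stays the same across versions.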

Paul Topping

Oct 3, 2006, 8:45:49 PM
to Ian Hickson, www-...@w3.org, dev-tec...@lists.mozilla.org, David Carlisle
While backward compatibility across versions of MathML is obviously
desirable, I believe that the creators of future versions of MathML will
feel free to change the definition of something, deprecate it, or remove
it in the new version. Backward compatibility can and should be provided
in the MathML renderer but it can only do so if it can identify the
version of the MathML content it is processing.

Paul

> -----Original Message-----
> From: Ian Hickson [mailto:i...@hixie.ch]
> Sent: Tuesday, October 03, 2006 5:39 PM
> To: Paul Topping
> Cc: David Carlisle; www-...@w3.org; dev-tec...@lists.mozilla.org
> Subject: RE: MathML-in-HTML5
>

Ian Hickson

Oct 3, 2006, 9:13:32 PM
to Paul Topping, www-...@w3.org, dev-tec...@lists.mozilla.org, David Carlisle
On Tue, 3 Oct 2006, Paul Topping wrote:
>
> While backward compatibility across versions of MathML is obviously
> desirable, I believe that the creators of future versions of MathML will
> feel free to change the definition of something, deprecate it, or remove
> it in the new version. Backward compatibility can and should be provided
> in the MathML renderer but it can only do so if it can identify the
> version of the MathML content it is processing.

IMHO backwards compatibility is not just "desirable", it is absolutely and
fundamentally critical. Specification authors should always make their
specs backward and forward looking. Yes, this constrains what the spec
authors can do. That's life.

BTW, UAs can't tell what version of content they are handling, even if the
language provides a way to tell that, because authors regularly lie (just
have a look at HTML DOCTYPEs around the Web, or SVG versions).

sha...@shantirao.com

Oct 4, 2006, 1:48:50 AM
to
Howdy,

For the record, I am a rocket scientist, and I have no idea how to write
XHTML, because I have a zillion other more important things to do than
learn about XHTML. The phrase we usually use is, "it's not brain surgery
...", but then I met a brain surgeon, and he was parsimonious with his
time, too.

Here are some binary decisions you can make to decide where you fall in
the discussion:

1. Do you believe that MathML should only be allowed in well-formed XML
documents?
[ ] Yes [ ] No

2. Do you believe that MathML should be parsed by a web browser (as
opposed to being presented as opaque source)?
[ ] Yes [ ] No

3. Do you fear complexity?
[ ] Yes [ ] No

4. Do you fear losing control over MathML?
[ ] Yes [ ] No

Shanti

Chris Chiasson

Oct 4, 2006, 2:19:21 AM
to
For your statement about XHTML to be relevant, I have to assume you
know how to write HTML. Since you are on the MathML forum, I assume you
know how to write or produce MathML in some manner. What's so hard
about writing XHTML after you know how to do the other things?

By the way, I think the thread originator should just go ahead and
implement MathML in HTML. I don't know who is going to use it, but if
it doesn't take much away from other development and bug fixing
resources, why not go ahead?

Of course, I am not sure my opinion counts...

Juan R.

Oct 4, 2006, 3:31:20 AM
to
Chris Chiasson wrote:
> Juan R. wrote:
> > It is interesting that Nov, 1995 Wolfram Research draft for Math on the
> > web was never approved. The final Apr, 1998 MathML W3C recommendation,
> > of course, is not completely unrelated to early Wolfram draft, but is
> > not the same, somewhat as MathML is not ISO-12083 or TeX even if there
> > exist some similarities.
>
> I don't get it. I demonstrated strong similarities. I didn't say that
> MathML is isomorphic to Mathematica syntax. What are you trying to refute?
> Do we really disagree here?

MathML is an effort of several people including at Wolfram. You first
claimed that

> Anyway, present day MathML is strongly related to Mathematica's
> internal representation of math, as shown in this short example.

and next claimed that

> Anyway, I don't speak for WRI, but I think it's fairly obvious they
> will try to keep MathML "in their image" so that it will be easy for
> them to have an XML language for math that is understood by machines
> ... aka their computer algebra system.

The MathML example you provided is closer to LISP/Scheme than to
Mathematica

head[arg1,arg2,arg3] vs

<apply><head/><arg1/><arg2/><arg3/></apply> vs

(head arg1 arg2 arg3)

( ==> <apply> and ) ==> </apply>

Therefore I do not know why you claim that WRI will try to block
MathML.

> > > Notice how <apply> is used to capture the structure of
> > > head[arg1,arg2,arg3] as <apply><head/><arg1/><arg2/><arg3/></apply>.
> >
> > The first you write is a M expression. The second is a xml encoding of
> > a S expression. They are two different concepts even if you can
> > transform between both. Moreover, take the LISP/Scheme representation
> >
> > (head arg1 arg2 arg3).
> >
> > where the ( ) indicates an application to be evaluated. What is more
> > close to c-MathML? Lisp or Mathematica?
> >
> > What is more, so far as i know Mathematica uses M expressions just at
> > the syntax level not as internal representation.
>
> Neither do I know what an S expression is nor do I know what an M
> expression is.

But then do not claim what you claimed initially.

> Internally, a Mathematica expression is a one dimensional array of
> pointers to other expressions, symbols, strings, or numbers, as shown
> here:
>
> http://documents.wolfram.com/mathematica/book/section-A.9.2
>
> The first position in the array is called the head, and the normal
> formatting for an expression is as I showed: head[arg1,arg2,arg3]. If
> the formatting were different, such as (head arg1 arg2 arg3), this
> would probably require a change in parsing and syntax rules, but the
> internal representation would be the same.

Then you confirm to me that they are not using M expressions as internal
representation, just at the syntax level.

Juan R.

Oct 4, 2006, 3:44:03 AM
to

David Carlisle wrote:
>
> Currently I place (2), which is I think what is being suggested for mozilla,
> as 2nd preference but I could be persuaded that that is the best
> solution, especially if the "html" parser would allow (if not enforce)
> _all_ the relevant xml syntax, especially empty element syntax />
> (mathml has a lot of empty elements, although mostly that is in content
> mathml) and namespace declarations. Even if they are ignored they
> shouldn't be an error. pretty much all mathml is generated by tools or
> mechanical assistance of some kind, and if those tools are using xml
> syntax (as they are) then it won't always be easy for an end user to
> "correct" that mechanically generated markup and replace xml idioms by
> "html" ones.

I read this criticism before!

> Then of course there's the perennial question about what to do with those
> entity definitions... (I suppose "can I give them to someone else" is
> not an allowed answer:-)

The same!

> David

[http://lists.w3.org/Archives/Public/www-math/2006Oct/0000.html]

[http://canonicalscience.blogspot.com/2006/09/another-opportunity-lost-to-sensible.html]

Chris Chiasson

Oct 4, 2006, 3:50:48 AM
to
If you want to say ( head arg1 arg2 arg3 ) is closer to
<apply><head/><arg1/><arg2/><arg3/></apply> than head[arg1,arg2,arg3],
I will not contradict you.

I also believe it is likely that all of these syntaxes have the same
kind of tree structure in computer memory.
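
The claim that the three notations share one tree can be sketched in a few lines of Python. This is an illustration only: the tuple representation and the function names are hypothetical, and the XML form follows the schematic <head/> notation used in this thread, not the actual content MathML vocabulary.

```python
# A tree node is (head, [children]); a leaf is a plain string.

def to_mathematica(node):
    """Render as head[arg1,arg2,...]."""
    if isinstance(node, str):
        return node
    head, args = node
    return f"{head}[{','.join(to_mathematica(a) for a in args)}]"

def to_sexpr(node):
    """Render as a LISP/Scheme style (head arg1 arg2 ...)."""
    if isinstance(node, str):
        return node
    head, args = node
    return "(" + " ".join([head] + [to_sexpr(a) for a in args]) + ")"

def to_apply(node):
    """Render in the schematic <apply> notation used in this thread."""
    if isinstance(node, str):
        return f"<{node}/>"
    head, args = node
    return "<apply>" + f"<{head}/>" + "".join(to_apply(a) for a in args) + "</apply>"

tree = ("head", ["arg1", "arg2", "arg3"])
print(to_mathematica(tree))
print(to_sexpr(tree))
print(to_apply(tree))
```

All three printers walk the same data structure; only the surface syntax differs, which is the sense in which the notations can be converted into one another.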

My main claim was that if changes are made to presentation MathML that
make it difficult to derive the Mathematica box form from them, that
WRI will try to block the changes. In light of comments by Paul
Topping, that may no longer be the case due to a general lack of WRI's
participation in the MathML working group. If so, I imagine that the new
syntax would simply be unsupported by Mathematica.

Juan R.

Oct 4, 2006, 4:12:32 AM
to

Ian Hickson wrote:
> On Tue, 3 Oct 2006, Paul Topping wrote:
> >
> > The fact that adding XML islands to tag soup doesn't turn it into steak
> > shouldn't surprise anyone. But I don't think it is fair to dismiss the
> > value of adding MathML to HTML on that basis. I think Microsoft provided
> > a very simple mechanism that allows XML islands inside tag soup but
> > without making it more soupy, to stretch the metaphor a bit.
>
> I'm not saying don't add MathML to HTML. I'm saying don't add namespace
> syntax to HTML. I'm eagerly looking forward to seeing what Roger's
> experience with adding MathML to HTML is.
>

This is MathML

<mrow>
<mn>2</mn>
<mo>+</mo>
<mn>5</mn>
</mrow>

This is also

<m:mrow>
<m:mn>2</m:mn>
<m:mo>+</m:mo>
<m:mn>5</m:mn>
</m:mrow>

By _rejecting_ the last one you cannot claim that you are implementing
MathML (as a whole) in HTML 5 for spreading authoring and web
publishing.
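
A minimal sketch of why a namespace-aware XML parser treats the two
spellings as the same MathML. The xmlns declarations are an assumption added
for the example (the snippets above omit them), and Python's
xml.etree.ElementTree stands in for whatever XML parser a tool might use;
expanded_names is a hypothetical helper:

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Same expression, once with a default namespace and once with the m: prefix.
unprefixed = (
    f'<mrow xmlns="{MATHML_NS}">'
    "<mn>2</mn><mo>+</mo><mn>5</mn></mrow>"
)
prefixed = (
    f'<m:mrow xmlns:m="{MATHML_NS}">'
    "<m:mn>2</m:mn><m:mo>+</m:mo><m:mn>5</m:mn></m:mrow>"
)


def expanded_names(text):
    """List every element as '{namespace}localname', in document order."""
    return [elem.tag for elem in ET.fromstring(text).iter()]


print(expanded_names(unprefixed))
print(expanded_names(prefixed) == expanded_names(unprefixed))
```

The prefix itself never reaches the element names, so m:, mml: or no prefix
at all are interchangeable once the namespace is declared. An HTML parser
without namespace support sees none of this, which is the point under
dispute.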

Moreover, whereas the initial, basic MathML tools do not use prefixes, as far
as I can see every recent XML language or piece of software of any
importance that supports MathML uses prefixes. The mml prefix is
recommended by several languages and authoring sites, and is the default in
several DTDs and Schemas I know.

From memory (I have not checked): context, Elsevier CEP 5.2, DocBook...

See also

[http://dtd.nlm.nih.gov/publishing/tag-library/2.0/n-mag0.html]

If I remember correctly, MathPlayer (and ASCIIMath JS) uses m: in IE instead of mml:

> --
> Ian Hickson U+1047E )\._.,--....,'``. fL
> http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
> Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'

William F Hammond

Oct 4, 2006, 2:36:38 PM
to www-...@w3.org, dev-tec...@lists.mozilla.org
Robert Miner writes:

> ... However, note that currently the only way to get IE to hand off
> the MathML to a plugin like MathPlayer is to use a namespace prefix.

A much easier and quicker solution would be for the IE folk simply to
include the MathPlayer plugin with every copy of IE.

Absent that I doubt if we will really have interoperability.

-- Bill

William F Hammond

Oct 4, 2006, 2:42:50 PM
to www-...@w3.org, dev-tec...@lists.mozilla.org
Also Robert Miner writes:

> On the other hand, if in HTML5 you permit markup like
>
> <html>
> ...
> <p>Consider the case where
> <m:math><m:mi>n</m:mi><m:mo>=</m:mo><m:mn>2</m:mn></m:math>
> ...
> </html>

Are you sure that the elements inside <m:math> require prefixes?
Hasn't everything inside <m:math> been handed off to the plugin?

-- Bill

Robert Miner

Oct 4, 2006, 3:04:02 PM
to William F Hammond, www-...@w3.org, dev-tec...@lists.mozilla.org
Hi.

I'm not completely sure about this. It doesn't work today in
MathPlayer, but we might be able to get at those elements if we tried.
I'll investigate.

--Robert

Robert Miner
Director, New Product Development

Design Science, Inc.
140 Pine Avenue, 4th Floor
Long Beach, California 90802
USA
Tel: (651) 223-2883
Fax: (651) 292-0014
rob...@dessci.com
www.dessci.com
~ Makers of MathType, MathFlow, MathPlayer, WebEQ, Equation Editor,
TexAide ~

Ian Hickson

Oct 4, 2006, 3:39:01 PM
to Paul Topping, www-...@w3.org, dev-tec...@lists.mozilla.org
On Tue, 3 Oct 2006, Paul Topping wrote:
>
> It's not clear to me what you would expect to happen if the browser did
> "syntax checking". Just reject the document and display nothing? No one
> would want an HTML browser to do that.

Well, you and I wouldn't want that, but it isn't true to say that no-one
would want that. In fact there is a huge community of people who keep
asking for browsers to reject invalid content. The XML language is
entirely written around the concept that invalid content must be rejected.
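
A minimal sketch of that draconian behaviour, assuming Python's
xml.etree.ElementTree as a representative conforming XML parser; try_parse
is a hypothetical helper. Well-formed markup is accepted, while a single
mis-nested tag makes the parser refuse the whole document:

```python
import xml.etree.ElementTree as ET

well_formed = "<mrow><mn>2</mn><mo>+</mo><mn>5</mn></mrow>"
# <mo> is opened inside <mn> but <mn> is closed first: not well-formed XML.
mis_nested = "<mrow><mn>2<mo>+</mn></mo><mn>5</mn></mrow>"


def try_parse(text):
    """Report whether an XML parser accepts or rejects the input."""
    try:
        ET.fromstring(text)
        return "accepted"
    except ET.ParseError as err:
        # No partial tree is returned; the error is fatal.
        return f"rejected: {err}"


print(try_parse(well_formed))
print(try_parse(mis_nested))
```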


> Even tag soup can be validated, it's just that the "standard" is the
> de facto one defined by the most popular browsers.

Well, we're changing that with HTML5. But yes.

Ian Hickson

Oct 4, 2006, 7:02:30 PM
to Paul Topping, www-...@w3.org, dev-tec...@lists.mozilla.org
On Wed, 4 Oct 2006, Paul Topping wrote:
>
> I sense some sort of conflicting themes here or perhaps I'm just
> confused. Your earlier comments made me think that HTML 5 might be about
> stronger validation

I don't really understand what this means. Stronger than what? In what
sense?


> as you were worried about what MathPlayer might do with bad markup and
> suggested that refusing to render the document might be the right
> response.

I was merely pointing out that the term "XML islands" suggests XML-like
processing, which would imply draconian error handling. I wasn't trying to
imply that this was the better solution.


> So, this made me wonder what HTML 5 really was supposed to be. The name
> would imply that it is HTML's tag soup extended with some new stuff like
> MathML and, perhaps, with some of the worst soup removed if it was
> deemed unnecessary to compatibility with all the HTML out in the world
> and the tools that make it. I would also assume that since your WHATWG
> document (http://whatwg.org/specs/web-apps/current-work/) seems to
> distinguish between XHTML5 and HTML5 that they are versions of XHTML and
> HTML enhanced in parallel ways. Am I wrong?

The WHATWG Web Apps 1.0 specification defines a set of features for Web
browsers (mostly existing features previously defined in HTML4 and DOM2
HTML, or implemented as proprietary extensions, though there are some new
features as well). Most features are described in terms of DOM processing
rules, e.g. new DOM interfaces or new rules for handling certain elements
in DOM trees. In addition, it defines two serialisation syntaxes for
representing documents/applications that use these features. One of these
serialisations is just XML (with namespaces); some components of which are
to be in the XHTML namespace and are therefore known as XHTML5. The other
serialisation is a custom language known as HTML5; the specification
defines very specific parsing rules (including error handling rules) for
how to obtain a DOM tree from an HTML5 file.

In the context of HTML5 the term "tag soup" is meaningless, since there
is no UA-defined handling anymore, the spec defines all handling (in an
attempt to foster increased interoperability).

HTH,

Paul Topping

Oct 4, 2006, 8:31:50 PM
to Ian Hickson, www-...@w3.org, dev-tec...@lists.mozilla.org
So do you expect browsers like Mozilla and IE to accept HTML5 and handle
it as defined by your spec? I'm going to guess that your answer is yes
for Mozilla but no or "not my problem" for IE. If so, won't Mozilla have
to have 3 parsers and renderers, one for tag-soup HTML, one for XHTML,
and one for HTML5? Perhaps XHTML and HTML5 can share parsers and perhaps
HTML5 can be rendered by Gecko as is, thereby reducing the number of
pieces somewhat.

Regardless of whether the above is right or wrong, it sounds like you
are saying that adding MathML or <math> to HTML5 is not going to give us
MathML support in everyday tag soup HTML. At the same time, I hear that
Roger Sidje is talking about enhancing Mozilla so that his MathML
renderer will work in everyday tag soup HTML. So perhaps you guys are
talking about two completely separate things in this thread. Do I have
it right?

Paul

> -----Original Message-----
> From: Ian Hickson [mailto:i...@hixie.ch]
> Sent: Wednesday, October 04, 2006 4:03 PM
> To: Paul Topping
> Cc: www-...@w3.org; dev-tec...@lists.mozilla.org
> Subject: RE: MathML-in-HTML5
>

Ian Hickson

Oct 4, 2006, 8:44:41 PM
to Paul Topping, www-...@w3.org, dev-tec...@lists.mozilla.org
On Wed, 4 Oct 2006, Paul Topping wrote:
>
> So do you expect browsers like Mozilla and IE to accept HTML5 and handle
> it as defined by your spec?

In due course. It won't happen overnight by any means; indeed the spec is
far from ready and there are several major known issues with the spec as
currently written.


> I'm going to guess that your answer is yes for Mozilla but no or "not my
> problem" for IE.

If IE doesn't get on board, the whole affair is rather a waste of time.
The browser with the largest chunk of market share is our biggest problem.


> If so, won't Mozilla have to have 3 parsers and renderers, one for
> tag-soup HTML, one for XHTML, and one for HTML5?

The "tag soup HTML" parser is the HTML5 parser (or vice versa, depending
on how you look at it). The "XHTML" parser is just the XML parser.


> Regardless of whether the above is right or wrong, it sounds like you
> are saying that adding MathML or <math> to HTML5 is not going to give us
> MathML support in everyday tag soup HTML.

I'm not sure what you mean by "everyday tag soup HTML" as distinct from
HTML5. HTML5, as proposed, is merely the next version of "everyday tag
soup HTML" (except that it would no longer be "tag soup" since the
processing model would be well-defined and interoperable across browsers).

Roger B. Sidje

Oct 4, 2006, 9:04:16 PM
to Paul Topping, www-...@w3.org, dev-tec...@lists.mozilla.org, Ian Hickson
On 5/10/2006 10:31 AM, Paul Topping wrote:

> So do you expect browsers like Mozilla and IE to accept HTML5 and handle
> it as defined by your spec? I'm going to guess that your answer is yes
> for Mozilla but no or "not my problem" for IE. If so, won't Mozilla have
> to have 3 parsers and renderers, one for tag-soup HTML, one for XHTML,
> and one for HTML5? Perhaps XHTML and HTML5 can share parsers and perhaps
> HTML5 can be rendered by Gecko as is, thereby reducing the number of
> pieces somewhat.
>
> Regardless of whether the above is right or wrong, it sounds like you
> are saying that adding MathML or <math> to HTML5 is not going to give us
> MathML support in everyday tag soup HTML. At the same time, I hear that
> Roger Sidje is talking about enhancing Mozilla so that his MathML
> renderer will work in everyday tag soup HTML. So perhaps you guys are
> talking about two completely separate things in this thread. Do I have
> it right?

To sum up, we are all discussing what to agree upon. (Otherwise it
wouldn't be much of a discussion.)

I could make the renderer work in any or all of the three cases. But
what is of practical interest is that HTML5 gives us MathML support
everywhere (with today's browsers), and this is where IE+MathPlayer
comes into the picture. I could then focus on what has been agreed.
---
RBS

>
> Paul
>
>
>>-----Original Message-----
>>From: Ian Hickson [mailto:i...@hixie.ch]
>>Sent: Wednesday, October 04, 2006 4:03 PM
>>To: Paul Topping
>>Cc: www-...@w3.org; dev-tec...@lists.mozilla.org
>>Subject: RE: MathML-in-HTML5
>>

>>On Wed, 4 Oct 2006, Paul Topping wrote:
>>

Paul Topping

Oct 4, 2006, 9:07:54 PM
to Ian Hickson, www-...@w3.org, dev-tec...@lists.mozilla.org
How can HTML5 be the next version of "everyday tag soup HTML" without
actually accepting tag soup documents?

Once HTML5 support is built into Mozilla, can I modify my tag soup HTML
page (ie, one with soupy tags that requires de facto HTML browser
behavior) by simply adding new stuff to make use of HTML5-only features?
Or do I have to also correct my soupy tags?

Paul

> -----Original Message-----
> From: Ian Hickson [mailto:i...@hixie.ch]
> Sent: Wednesday, October 04, 2006 5:45 PM
> To: Paul Topping
> Cc: www-...@w3.org; dev-tec...@lists.mozilla.org
> Subject: RE: MathML-in-HTML5
>
> On Wed, 4 Oct 2006, Paul Topping wrote:
> >
> > So do you expect browsers like Mozilla and IE to accept HTML5 and handle
> > it as defined by your spec?
>
> In due course. It won't happen overnight by any means; indeed the spec is
> far from ready and there are several major known issues with the spec as
> currently written.
>
>
> > I'm going to guess that your answer is yes for Mozilla but no or "not my
> > problem" for IE.
>
> If IE doesn't get on board, the whole affair is rather a waste of time.
> The browser with the largest chunk of market share is our biggest problem.
>
>
> > If so, won't Mozilla have to have 3 parsers and renderers, one for
> > tag-soup HTML, one for XHTML, and one for HTML5?
>
> The "tag soup HTML" parser is the HTML5 parser (or vice versa, depending
> on how you look at it). The "XHTML" parser is just the XML parser.
>
>
> > Regardless of whether the above is right or wrong, it sounds like you
> > are saying that adding MathML or <math> to HTML5 is not going to give us
> > MathML support in everyday tag soup HTML.
>
> I'm not sure what you mean by "everyday tag soup HTML" as distinct from
> HTML5. HTML5, as proposed, is merely the next version of "everyday tag
> soup HTML" (except that it would no longer be "tag soup" since the
> processing model would be well-defined and interoperable across browsers).

Ian Hickson

Oct 4, 2006, 9:25:27 PM
to Paul Topping, www-...@w3.org, dev-tec...@lists.mozilla.org
On Wed, 4 Oct 2006, Paul Topping wrote:
>
> How can HTML5 be the next version of "everyday tag soup HTML" without
> actually accepting tag soup documents?

It does "accept tag soup documents". It fully defines the parsing model
for HTML, whether correct or incorrect, in a way compatible with how
browsers handle incorrect ("tag soup") documents today. By thus defining
how to handle "tag soup", it makes the entire concept redundant. ("Tag
soup" is so called because it implies that browsers don't know how to
handle it and so all do their own thing, which is indeed the case today
but will no longer be the case if HTML5 is implemented per spec.)
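
A minimal sketch of tolerant processing of incorrect markup, with a loud
caveat: Python's html.parser is only a lenient tokenizer and does not
implement the HTML5 parsing algorithm, which goes further and defines
exactly which tree every browser must build. TagLogger is a hypothetical
class for this illustration:

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record tags and text as they are seen, without ever giving up."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))


# Unclosed <p> elements and mis-nested <b>/<i>: typical "tag soup".
soup = "<p>An unclosed paragraph<p>with <b>mis-nested <i>tags</b></i>"

logger = TagLogger()
logger.feed(soup)
logger.close()

for event in logger.events:
    print(event)
```

The input is processed to the end and nothing is rejected. What a spec adds
on top of this is a single agreed answer for how the unclosed and mis-nested
elements end up in the DOM.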


> Once HTML5 support is built into Mozilla, can I modify my tag soup HTML
> page (ie, one with soupy tags that requires de facto HTML browser
> behavior) by simply adding new stuff to make use of HTML5-only features?

Yes. Such documents would be as non-conformant as they are today, but they
would work, just like today (with the new features).

Juan R.

Oct 5, 2006, 4:56:38 AM
> Ian Hickson said:

>
> On Wed, 4 Oct 2006 juanrgo...@canonicalscience.com wrote:
>>>
>>> IMHO backwards compatibility is not just "desirable", it is
>>> absolutely and fundamentally critical. Specification authors should
>>> always make their specs backward and forward looking.
>>
>> Then why all this stuff? What you are proposing (oops, I forgot that
>> you are not proposing anything, even if it looks like it) is not backward
>> compatible with the current state of MathML authoring and processing.
>
> The idea of introducing math markup to text/html documents is
> completely orthogonal from the existence of MathML markup in XML
> documents. Compatibility with current MathML authoring and processing
> is neither here nor there -- the proposal here doesn't affect it in
> any way.
>
>
>> You can do [various things that have been at one time or another
>> suggested] but do not call it ***mathml***
>
> Ok.
>

Ok, let me call it 'MathTml'.

I am more pragmatic still and do not worry about alternative
proposals that try to obtain extra functionality in well-defined fields.
Accordingly, I am not against other HTML5 initiatives, including <canvas>
or Web Forms.

Since we have already discussed alternative mathematics in HTML5, I see
no problem here either. My criticism was focused on the fact that the
implementation of a so-called 'MathML' was not really MathML. But
since you are really focused on

> The idea of introducing math markup to text/html documents is
> completely orthogonal from the existence of MathML markup in XML
> documents.

Then most of my previous criticism vanishes. For instance, I see no
problem with you doing a statistical analysis of the usage of MathML
entities in order to implement a subset of them, since you are _not_
doing MathML. It would not surprise me if you found zero
frequency for many entities and then decided not to support them.

If this new approach succeeds and many mathematicians begin to
publish math in your HTML 5 (you often assure us that you have received
this kind of feedback), then you can expect converters to and from CanonML.

In the same way, I already told Brian Jones, a program manager in Office,
that conversion between CanonML and the new 2007 format for mathematics
(OMML) is one of my priorities (several scientific communities submit
papers in Word format today). In fact, even PRL (and other APS journals)
now accept submissions in Word format.

In the same way, I said here that if the forthcoming MathML 3 is good
enough and gains a minimal popularity among our people, we will develop
converters for it as well.

In the same way, the XML-MAIDEN approach is a crucial piece of the next
(non-experimental) version of the Center's website. Visitors can read
mathematical formulae without plugins, without special fonts and without a
Mozilla browser, thanks to CSS techniques. I think that scientific and
educational material should be accessible to anyone, and I am really tired
of "you need IE+plugin or FF+fonts for seeing this site".

Having said this, I have some questions about your proposal.

The following MathML

<mrow><mi>a</mi><mo>+</mo><mn>2</mn></mrow>

becomes the following MathTml in HTML5

<mrow>a + 2</mrow>

but

<mfrac><mi>b</mi><mn>5</mn></mfrac>

would be (technically inefficient in Mozilla)

<mfrac><mrow>b</mrow><mrow>5</mrow></mfrac>

Why not add num and den,

<mfrac><num>b</num><den>5</den></mfrac>

which had been 'approved' on the WHATWG list (even a Mozilla developer agreed
it was not a bad idea)?

Moreover, is the tokenizer whitespace-based, like the EXSLT function or
XSLT template implementations of str:tokenize()? Is the following valid?

<mrow>0.269736842105263157894<mover accent='true'>736842105263157894
&#...</mover></mrow>

or maybe

<mrow>0.269736842105263157894 <mover accent='true'>736842105263157894
&#...</mover></mrow>?


Ian Hickson said:
>
> There are literally tens of millions of
> pages, for instance, that use the "xmlns" attribute on the <a>
> element,

What is the problem with using the xmlns attribute on <a> or <html:a>
elements in an XML approach?

Chris Chiasson

Oct 5, 2006, 12:30:09 PM

Juan R. wrote:
> I think that scientific and educational material should be
> accessible to anyone, and I am really tired of "you need IE+plugin or
> FF+fonts for seeing this site".

Firefox should offer to go to the installation site and initiate the
download for the fonts. After the download + checksum or pgp
verification, the install should be executed. The user should be
prompted to agree to the license. The fonts should be installed. The
page should be reloaded with the new fonts active.

This process should be extended to the STIX fonts when they are
available.

> Having said this, I have some questions about your proposal.
>
> The following MathML
>
> <mrow><mi>a</mi><mo>+</mo><mn>2</mn></mrow>
>
> becomes the following MathTml in HTML5
>
> <mrow>a + 2</mrow>
>

If "optimizations" like this are going to be created, a new form of
MathML should be created to handle it. Code should be written that can
transform it to the presentation version automatically. Perhaps it
could be called input MathML.
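
A minimal sketch of such a transformation, assuming the shorthand is plain
text inside a single mrow. The classification rule is an assumption made for
this illustration only (runs of digits become mn, runs of letters become mi,
any other non-space character becomes mo), and expand_shorthand and TOKEN
are hypothetical names:

```python
import re
import xml.etree.ElementTree as ET

# One token per match: a number, an identifier, or a single other character.
TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([A-Za-z]+)|(\S))")


def expand_shorthand(text):
    """Wrap each token of a plain-text expression in mn, mi or mo."""
    row = ET.Element("mrow")
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        number, identifier, operator = match.groups()
        if number is not None:
            ET.SubElement(row, "mn").text = number
        elif identifier is not None:
            ET.SubElement(row, "mi").text = identifier
        else:
            ET.SubElement(row, "mo").text = operator
        pos = match.end()
    return ET.tostring(row, encoding="unicode")


print(expand_shorthand("a + 2"))
print(expand_shorthand("a+2"))
```

Both calls print <mrow><mi>a</mi><mo>+</mo><mn>2</mn></mrow>, so in this
sketch whitespace is not significant. A real converter would also have to
decide how to treat multi-letter names, entities and nested elements such as
mfrac, none of which is attempted here.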

Juan R.

Oct 6, 2006, 4:45:00 AM
Chris Chiasson wrote:
> Juan R. wrote:
> > I think that scientific and educational material should be
> > accessible to anyone, and I am really tired of "you need IE+plugin or
> > FF+fonts for seeing this site".
>
> Firefox should offer to go to the installation site and initiate the
> download for the fonts. After the download + checksum or pgp
> verification, the install should be executed. The user should be
> prompted to agree to the license. The fonts should be installed. The
> page should be reloaded with the new fonts active.

And if the user does not agree with the license? Can they not see the math?

Moreover, the TeX fonts were designed for high-quality printing devices,
not for the screen, but <irony>Mozilla prints math with low quality and
people are returning to TeX engines for printing xml docs</irony>.

Moreover, forcing math users to use one type of font is like forcing text
users to write in Times New Roman.

> This process should be extended to the STIX fonts when they are
> available.

But since the Mozilla engine is polluted with TeX metrics, when the new
fonts are ready Mozilla will not be able to exploit them without a
significant rewrite of the math engine. Users would be prompted to download
a new version of the browser.

> > Having said this, I have some questions about your proposal.
> >
> > The following MathML
> >
> > <mrow><mi>a</mi><mo>+</mo><mn>2</mn></mrow>
> >
> > becomes the following MathTml in HTML5
> >
> > <mrow>a + 2</mrow>
> >
>
> If "optimizations" like this are going to be created, a new form of
> MathML should be created to handle it. Code should be written that can
> transform it to the presentation version automatically. Perhaps it
> could be called input MathML.

I presented a similar approach to the MathML list, where mi-mo-mn were
omitted and the notation for fractions, roots, subscripts and
superscripts... was simplified for authoring.

[http://lists.w3.org/Archives/Public/www-math/2006Mar/0027.html]

It was flatly rejected on the list. See for instance Carlisle's
criticism of the mixture of text and tags

[http://lists.w3.org/Archives/Public/www-math/2006Mar/0028.html]

<blockquote>
I honestly see no benefit in
having some of the expression marked up as XML and some not.
</blockquote>

Moreover, some informal statistics and feedback confirm to me that people
do not want to write

<mrow>a + 2</mrow>

but rather TeX {a+2} or ASCIIMath (a+2)

The CanonMath program was abandoned and replaced by CanonFormal.
CanonFormal is not based on XML but on the new CanonML language, currently
in development [http://www.pault.com/xmlalternatives.html].

The above is [a + 2] in CanonML.

After some research, I now think that the use of XML for mathematics is a
misuse of the concept of a markup language. XML comes from
documentation, not from data, and works well when the ratio of markup to
text is close to zero (documentation usually consists of large chunks of
text marked up as <para> <section> <title>...)

Something like this MathML

<mml:math>
<mml:apply><mml:plus/>
<mml:ci>a</mml:ci>
<mml:cn>2</mml:cn>
</mml:apply>
</mml:math>

is a clear abuse of the concept of a markup language.

At the other end of the spectrum, LISP was designed for data, not
documents, which is the reason that documentation systems based on LISP
(e.g. Scribe) were never popular, even among LISP people! It is not
surprising that the S-expression syntax was rejected by the SGML folks in
favor of the angle-bracket tag syntax.
