I am currently driving an effort to enable MathML-in-HTML (apart from MathML-in-XHTML that we already support). I have a patch that serves the dual purpose of showing where things are going and the issues to ponder about.
At the Firefox engineering meeting in Mountain View (December 2005), I pleaded that we enable MathML in HTML5 to advance the cause of MathML, which is so far locked in an XHTML/XML world that does not seem to be going anywhere in terms of display content as opposed to data (witness the WHATWG effort -- http://www.whatwg.org). Those to whom I spoke included dbaron, hixie and sicking, and they welcomed the suggestion, asking for a broader discussion. Hixie raised the caveat that MathML elements should still remain in the MathML namespace. He e-mailed me a while ago about a discussion on this matter in the WHATWG mailing list, which can be seen here http://listserver.dreamhost.com/pipermail/whatwg-whatwg.org/2006-June....
That discussion is however too broad and involves tangential issues such as inventing another syntax, etc. My original take was simply to enable MathML+HTML, in the same vein as we have MathML+XHTML. I think MathML is suffering from having to fight the battle for adoption of XHTML as well. As a niche technology, it does not have the means to wage that fight. What it simply needs is MathML-in-HTML. W3C failed to recognise that it could retrofit MathML in HTML -- see this archived post for some insight: http://groups.google.com/group/netscape.public.mozilla.mathml/msg/4d5... But HTML5 being shepherded by WHATWG could provide the right framework for this to happen now.
I have finally been able to code this up (while keeping MathML elements in the MathML namespace). I attached the patch I had so far in bug 353926.
We support MathML-in-HTML5 when these two conditions are met:
1. The DOCTYPE of the document says so. If yes, we enable MathML entities (TODO) and flag mMayHaveMathML in the HTML content sink.
2. And either a) OR b) is met:
a) <html> has the MathML namespace as the value of an attribute with a prefix, e.g., <html xmlns:m="http://www.w3.org/1998/Math/MathML">.
In this case, we cache the prefix "m" in mMathMLNameSpacePrefix, and we intercept all <m:tag> in the document and create MathML content nodes for them.
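b) <math> declares the MathML namespace as its default namespace, e.g., <math xmlns="http://www.w3.org/1998/Math/MathML">.
In this case, we intercept all non-HTML elements inside the <math> tag and create MathML content nodes for them.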
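The two conditions above can be read as a single predicate. The following is only a sketch of that reading, not the code in the patch: the function name, its parameters, and the tiny table of HTML tags are all hypothetical, and treating the <math> element itself as part of the intercepted subtree is an assumption.

```python
# Hypothetical, deliberately tiny stand-in for the sink's table of known HTML tags.
KNOWN_HTML_TAGS = {"html", "body", "p", "b", "i", "span", "div"}


def wants_mathml_node(tag, *, doctype_allows_mathml,
                      mathml_prefix=None, in_math_subtree=False):
    """Return True if the sink should create a MathML content node for `tag`.

    `in_math_subtree` is assumed to be True for a qualifying <math> element
    and for everything nested inside it.
    """
    # Condition 1: the DOCTYPE must have opted in.
    if not doctype_allows_mathml:
        return False
    # Condition 2a: the tag carries the prefix cached from <html xmlns:m="...">.
    if mathml_prefix and tag.startswith(mathml_prefix + ":"):
        return True
    # Condition 2b: a non-HTML element inside <math>.
    return in_math_subtree and tag not in KNOWN_HTML_TAGS


print(wants_mathml_node("m:mfrac", doctype_allows_mathml=True, mathml_prefix="m"))
print(wants_mathml_node("b", doctype_allows_mathml=True, in_math_subtree=True))
```

The second call returns False, which is exactly the HTML-in-MathML case raised under the issues: a <b> inside <math> stays an HTML node.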
Issues: 1. Tag soup: we understand that we are exposing ourselves to this.
2. a) What about CSS matching rules? From the Style System point of view, the document is still HTML, but <m:math> is in the MathML namespace. We might have to special case MathML-in-HTML5 in the Style System as well.
b) The second option raises an issue with HTML-in-MathML, e.g.,

<math xmlns="http://www.w3.org/1998/Math/MathML">
  <b>bold</b>
</math>

We don't intercept the <b> in this case. Hence, even though it is HTML-in-MathML without an explicit XHTML namespace for <b>, the HTML sink will give <b> an HTML content node. This is not really XHTML-friendly. On the other hand, we don't want to be an XML parser either... These are conflicting objectives. We need to decide what to do. We may agree to only support tags with prefixes as in a), or also keep b) knowing that it has this XHTML-unfriendly behavior.
---
RBS
> Hixie raised the caveat that MathML elements should still remain in the > MathML namespace.
I meant in the DOM, I didn't mean in the markup. I don't think we should have any namespace declarations or namespace prefixes in text/html; I would just have the HTML parser always support the MathML elements, in the same way that it supports any random unknown element today, except that when it sees a MathML element it puts it into the MathML namespace in the DOM rather than the XHTML namespace.
I really don't think we want to introduce namespace prefixes or namespace declarations into tag soup. I think that would be a big mistake.
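To make the distinction between markup and DOM concrete, here is a minimal sketch using Python's xml.dom.minidom. It assumes a parser that recognises MathML elements purely by name; the MATHML_ELEMENTS set is a hypothetical subset and create_element is an illustrative helper, not part of any real parser.

```python
from xml.dom.minidom import getDOMImplementation

XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical subset of the element names the parser would treat as MathML.
MATHML_ELEMENTS = {"math", "mrow", "mfrac", "msqrt"}


def create_element(doc, tag):
    """Create a DOM element, choosing its namespace from the tag name alone."""
    ns = MATHML_NS if tag in MATHML_ELEMENTS else XHTML_NS
    return doc.createElementNS(ns, tag)


doc = getDOMImplementation().createDocument(XHTML_NS, "html", None)
frac = create_element(doc, "mfrac")   # known MathML name
para = create_element(doc, "p")       # ordinary HTML element
print(frac.tagName, frac.namespaceURI)
print(para.tagName, para.namespaceURI)
```

No prefix or xmlns attribute appears anywhere in the input; the namespace exists only on the resulting DOM node.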
-- Ian Hickson U+1047E )\._.,--....,'``. fL http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
If MathML is considered a subset of HTML5, then no namespace declaration would be necessary. However, if MathML is going to work in HTML that isn't declared as HTML5 (not clear to me from this thread), then the document would be poorly specified without it, IMHO.
At the risk of inciting an anti-Microsoft backlash, I should remind some on the list that IE has covered this territory before. They already have a mechanism for declaring XML islands in HTML that seems to work just fine. Of course, Mozilla won't be interested in duplicating IE's way of associating a plugin as the renderer of the namespace in the document. IMHO, it doesn't belong there anyway. It is better (i.e., more secure) to keep such associations out of the content.
> If MathML is considered a subset of HTML5, then no namespace declaration > would be necessary. However, if MathML is going to work in HTML that > isn't declared as HTML5 (not clear to me from this thread), then the > document would be poorly specified without it, IMHO.
As far as HTML5 UAs are concerned, declaring HTML as HTML5 consists of labelling it as text/html. It isn't clear to me what you would consider HTML that isn't declared as HTML5. With the exception of quirks which are required for compatibility with de facto standards that disagree with de jure standards, HTML has no practical versioning story -- all features work in all documents, regardless of the official "version" of HTML used.
> At the risk of inciting an anti-Microsoft backlash, I should remind some > on the list that IE has covered this territory before. They already have > a mechanism for declaring XML islands in HTML that seems to work just > fine.
XML data islands don't form part of the parent DOM (they are "islands", as opposed to part of the document). I'm not sure how wrapping <xml> tags around the MathML content would help. :-)
> And, I should have added that without a namespace declaration there > would be no way to differentiate different versions of MathML. While > most MathML instances are now MathML 2.0, the MathML 3.0 effort is just > now starting up.
Why would you need to distinguish them? MathML2 is a superset of MathML1, and (for all intents and purposes) any compliant MathML2 UA can process any compliant MathML1 content. I would assume that this would continue to be the case; if not, then this is IMHO a problem with MathML3.
Note that the namespace declaration can't currently distinguish between MathML1 and MathML2; I don't see any reason why MathML3 would change this.
-- Ian Hickson
I don't understand. Aren't people who are savvy enough to generate MathML also savvy enough to generate XHTML? Has anyone actually said, "That MathML I can handle, but what's this XHTML?"
> I don't understand. Aren't people who are savvy enough to generate > MathML also savvy enough to generate XHTML? Has anyone actually said, > "That MathML I can handle, but what's this XHTML?"
Savvy, yes. But also impatient. You will notice that HTML is what gets used -- not XHTML.
I like to write straight HTML with embedded LaTeX, then run it through a translator to turn $exponents^2$ into MathML. Sure, HTML->XHTML converters exist, but again, I'm lazy, selfish, and impatient.
> XML data islands don't form part of the parent DOM (they are "islands", as > opposed to part of the document). I'm not sure how wrapping <xml> tags > around the MathML content would help. :-)
The syntax Paul was referring to here wasn't the <xml> convention, but the ability in IE to have (explicitly prefixed) XML elements within an HTML document with rendering controlled by an external component, but _without_ any other flag at that point in the markup, such as <xml> or <object> etc.
In the IE implementation you need to have an <object> in the head pointing at the particular rendering component, which is fairly horrible, and you also need to declare the namespace using (a variant of) an early working draft namespace syntax using a PI; but as Paul said, those parts needn't be copied. An example of a document using this syntax is shown here:
By using a different classid you can do the same thing to include (explicitly prefixed) svg into an html document and have it rendered by Adobe's svg viewer, and in principle any other vocabularies (although I don't personally know of any other implementations of this, except techexplorer, which is again for MathML).
I'm not sure. Having math more or less added directly to html would be nice in many ways, but I'm not sure how well it scales: if you think people might want to have html+svg+chemml+... then perhaps having an api that allows processing to be attached to namespaced elements would be more general. On the other hand, that was part of the reason for having namespaces (and for that matter, xml itself): that people could serve all sorts of different xml vocabularies and have clients do whatever is necessary. I suspect part of the reason for "html5" is a feeling that that never happened and isn't going to be mainstream any time soon, and that a solution that directly addresses the fixed html vocabulary, with perhaps two specific extensions such as svg and mathml, will in practice cover the vast majority of browser needs, and other vocabularies can be transformed to html+.. before being served.
> The syntax Paul was referring to here wasn't the <xml> convention, but > the ability in IE to have (explicitly prefixed) XML elements within an > HTML document with rendering controlled by an external component, but > _without_ any other flag at that point in the markup, such as > <xml> or <object> etc.
Oh, well, as noted earlier, the idea of namespace prefixes in HTML isn't one that I personally am particularly fond of.
> I suspect part of the reason for "html5" is a feeling that that never > happened and isn't going to be mainstream any time soon, and that a > solution that directly addresses the fixed html vocabulary, with perhaps > two specific extensions such as svg and mathml will in practice cover > the vast majority of browser needs, and other vocabularies can be > transformed to html+.. before being served.
I think that's pretty much exactly correct, yes.
-- Ian Hickson
sha...@shantirao.com writes: > On 9/24/2006 10:43 AM, Chris Chiasson wrote: >> You find it easier to transform HTML + LaTeX instead of XHTML + LaTeX? >> What kind of tools are you using?
> A text editor, and itexMML, of course! Sure, more sophisticated tools > exist, but they aren't very reliable, are they?
Oh?
I expect that the author of whatever more sophisticated tool you try would like to hear of any lack of reliability you find.
> What I would be proposing for HTML5 is just the following list of > elements: > > math, mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom, > mfenced, menclose, msub, msup, msubsup, munder, mover, munderover, > mmultiscripts, mtable, mlabeledtr, mtr, mtd, maction
I don't like mlabeledtr very much (I have already expressed my views about it to folks of the MathML WG), and would hope that they will take my suggestion for <mtr label="..."> in MathML3. The former is unnecessarily bloated and doesn't degrade gracefully at all with renderers that don't support it (not to mention that it is hard to fit in Gecko's existing table code).
However, your list misses some key tags, in particular leaf tags such as <mspace/> -- which is sometimes quite useful. Also, <mprescripts/> and <none/> are needed in <mmultiscripts> (albeit it can be argued that <none/> is the same as <mrow></mrow> or an empty <mspace/>, but the differentiation is worthwhile).
In general, I would prefer the list to at least include all the tags that we already support, and which existing webpages have come to depend on. This effectively boils down to your list above, excluding <mlabeledtr>, and including <mspace/>, <mprescripts/>, <none/> and <mi>, <mn>, <ms>, <mtext>, <mo>. In particular, <mo> is a vital tag as it is at the heart of those stretchy MathML characters.
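As a quick cross-check of the two lists, the revision described above can be written as set arithmetic. The element names are taken from the messages in this thread; the variable names are only illustrative.

```python
# The list proposed for HTML5, as quoted above.
HIXIE_LIST = {
    "math", "mrow", "mfrac", "msqrt", "mroot", "mstyle", "merror", "mpadded",
    "mphantom", "mfenced", "menclose", "msub", "msup", "msubsup", "munder",
    "mover", "munderover", "mmultiscripts", "mtable", "mlabeledtr", "mtr",
    "mtd", "maction",
}

# Leaf and token elements requested in the reply.
RBS_ADDITIONS = {"mspace", "mprescripts", "none", "mi", "mn", "ms", "mtext", "mo"}

# Drop <mlabeledtr>, then add the missing elements.
RBS_LIST = (HIXIE_LIST - {"mlabeledtr"}) | RBS_ADDITIONS

print(len(HIXIE_LIST), len(RBS_LIST))
print(sorted(RBS_LIST))
```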
Implementation-wise, as this inclusion of MathML-in-HTML5 marks the beginning of tag soup, it may be that the HTML parser would have to have some knowledge of leaf tags, so that for example, a stray <mspace> doesn't become the root of an entire HTML tree... which is later fed to the hapless MathML engine. (The patch I attached in bug 353926 ignored the issue.) --- RBS
>>>We didn't check that <canvas> wouldn't cause clashes, either.
>>I see. I had assumed that we in fact had.
>>>I don't see why. We don't want a flag for when people can use the storage >>>APIs. Or when they can use <img> elements. Or whatever.
>>True, because those are very unlikely to collide with random stuff the pages >>are doing (e.g. the storage APIs are using fairly long names that are unlikely >>to collide with page-defined functions and variables).
>>If we think MathML has a similarly low risk of collision, great.
> I don't know about "we".
> What I would be proposing for HTML5 is just the following list of > elements:
> ...and of those only <math> came up in the top 1000 elements in my > search of elements on about one billion pages.
> According to that same research, <math> is, on the Web, less frequent than > the following elements: <m>, <e>, <rem>, <tab>, <yr>, <prohibits>, <your>, > <lable>, <text-spez>, etc. It was present on less than 0.002% of the pages > the research covered. (To give an idea of scale, <h8> is used on more than > 0.003%, so if we avoid <math> because of this, we should probably > introduce <h7> and <h8> into HTML, since we're saying that's an important > enough level to worry about.)
> Now, of course, it could be that those 0.002% of pages are all hugely > important and that we'll break the Web in adding this feature. We can't > know until we've tried.
> I don't like mlabeledtr very much (I have already expressed my views > about it to folks of the MathML WG), and would hope that they will take > my suggestion for <mtr label="..."> in MathML3. The former is > unnecessarily bloated and doesn't degrade gracefully at all with > renderers that don't support it (not to mention that it is hard to fit > in Gecko's existing table code).
I'm happy to drop/add any tag to this list. Just give me the list you want.
> However, your list misses some key tags, in particular leaf tags such as > <mspace/> -- which is sometimes quite useful. Also, <mprescripts/> and > <none/> are needed in <mmultiscripts> (albeit it can be argued that > <none/> is the same as <mrow></mrow> or an empty <mspace/>, but the > differentiation is worthwhile).
I missed anything that wasn't in the table I happened upon in the spec. I didn't look very closely for the exact table I wanted.
Tell me what tags you want to have and we'll make that the list. You're the expert. :-)
> Implementation-wise, as this inclusion of MathML-in-HTML5 marks the > beginning of tag soup, it may be that the HTML parser would have to have > some knowledge of leaf tags, so that for example, a stray <mspace> > doesn't become the root of an entire HTML tree... which is later fed to > the hapless MathML engine. (The patch I attached in bug 353926 ignored > the issue.)
Don't worry, these tags auto-close when a parent tag is closed.
<foo><bar><baz></foo><quux>
...results in this DOM:
<foo>
  <bar>
    <baz>
<quux>
For leaf nodes with following siblings, people will have to use end tags, as in:
<foo><bar></bar><baz></baz></foo><quux></quux>
If we want to start adding actual leaf tags, I'd rather do this in a second stage, after we have a proof of concept. (I've so far avoided adding any new tags to the HTML5 parser spec, but eventually there will be a bunch we have to add.)
We can go from non-empty to empty much more easily than from empty to non-empty.
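The auto-closing behaviour described above can be imitated with a toy tree builder on top of Python's html.parser. This is a sketch under simplifying assumptions, not the HTML5 parsing algorithm: it only models "an end tag also closes any still-open descendants", and the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class AutoCloseTreeBuilder(HTMLParser):
    """Toy tree builder: an end tag also closes any still-open descendants."""

    def __init__(self):
        super().__init__()
        self.root = ("#root", [])           # each node is (name, children)
        self.open_elements = [self.root]    # stack of currently open nodes

    def handle_starttag(self, tag, attrs):
        node = (tag, [])
        self.open_elements[-1][1].append(node)
        self.open_elements.append(node)

    def handle_endtag(self, tag):
        if tag not in [name for name, _ in self.open_elements[1:]]:
            return                          # stray end tag: ignored in this sketch
        while self.open_elements.pop()[0] != tag:
            pass                            # implicitly close unclosed descendants


def outline(markup):
    """Return the resulting tree as a list of indented lines."""
    builder = AutoCloseTreeBuilder()
    builder.feed(markup)
    builder.close()
    lines = []

    def walk(children, depth):
        for name, kids in children:
            lines.append("  " * depth + f"<{name}>")
            walk(kids, depth + 1)

    walk(builder.root[1], 0)
    return lines


print("\n".join(outline("<foo><bar><baz></foo><quux>")))
print("\n".join(outline("<foo><bar></bar><baz></baz></foo><quux></quux>")))
```

The first call nests <baz> inside <bar>; the second, with explicit end tags, makes <bar> and <baz> siblings.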
-- Ian Hickson
> I'm happy to drop/add any tag to this list. Just give me the list you > want.
OK.
> For leaf nodes with following siblings, people will have to use end tags, > as in:
> <foo><bar></bar><baz></baz></foo><quux></quux>
> If we want to start adding actual leaf tags, I'd rather do this in a > second stage, after we have a proof of concept. (I've so far avoided > adding any new tags to the HTML5 parser spec, but eventually there will be > a bunch we have to add.)
OK, I see.
The other issue is those 2000 entities that MathML has. You said that you are not a big fan of a namespace thingy on the root <html> element.
Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all W3C entities _by default_? We have a proof-of-concept of that in View Selection Source, BTW. It will display any entity it can. http://lxr.mozilla.org/mozilla/source/content/base/public/nsIDocument... As VSS has stood the test of time without major complaints, perhaps <!DOCTYPE html> could assume that too? If that is agreed, we are all clear.
The other remaining issue might be with style matching because <math> will then be internally in the MathML namespace whereas the HTML document is in the none namespace (at present), but we will see how it goes from there. --- RBS
> The other issue is those 2000 entities that MathML has.
Yeah... Do we really need those? Some of them seem reasonable to add, but 2000 seems like too many for the mnemonic advantage to beat just using Unicode codepoints...
The problem with adding entities is that a LOT of people do things like
href="/u?aa=foo&ab=foo&ac=foo&ad=foo"
...which today works, but would break if MathML entities were introduced (since &ac is a MathML entity).
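The hazard is easy to reproduce with any entity name that is recognised without a trailing semicolon. The sketch below uses Python's html module, which implements the entity table that HTML5 eventually standardised; that table happens to list "ac;" only with its semicolon, so the URL above survives, while a legacy name such as "copy" does not. Note that html.unescape applies text-content rules and is not a browser's attribute-value parser, so this only illustrates the ambiguity.

```python
from html import unescape
from html.entities import html5

url = "/u?aa=foo&ab=foo&ac=foo&ad=foo"

# "ac" is only defined with a trailing semicolon in this table,
# so the query string passes through unchanged.
print("ac;" in html5, "ac" in html5)
print(unescape(url) == url)

# A parameter named after a semicolon-less legacy entity is corrupted.
print(unescape("/u?a=1&copy=2"))
```

The last line prints the URL with "&copy" replaced by the copyright sign, which is the kind of breakage feared here for "&ac".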
> Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all > W3C entities _by default_?
Don't do anything based on the DOCTYPE. HTML5 is anything sent as text/html.
> The other remaining issue might be with style matching because <math> > will then be internally in the MathML namespace whereas the HTML > document is in the none namespace (at present), but we will see how it > goes from there.
I don't see why this would cause any problems.
-- Ian Hickson
> The problem with adding entities is that a LOT of people do things like
> href="/u?aa=foo&ab=foo&ac=foo&ad=foo"
> ...which today works, but would break if MathML entities were introduced > (since &ac is a MathML entity).
That list is so big that trying to hand-pick some and leaving some out would need another committee...
>>Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting all >>W3C entities _by default_?
> Don't do anything based on the DOCTYPE. HTML5 is anything sent as > text/html.
I thought the DOCTYPE was trustworthy -- based on this excerpt from the HTML5 spec:
"HTML documents that use the new features described in this specification must start with the string <!DOCTYPE html> and, if they are served over the wire (e.g. by HTTP) must be labelled with the text/html MIME type."
If so, it would have meant fewer conflicts with agreed entities in HTML5.
BTW, for my own information, do you intend HTML5 to be transitional, almost-standards, or strict? If it is HTML5 (or XHTML5) served as text/html but put in the XHTML namespace at some later stage (as the HTML5 spec implies), it had better be strict, no? And that would be driven by the DOCTYPE detection code. Catch my drift? Or is tag soup going to be in the XHTML namespace?
If it is strict then maybe entities could be required to have a semi-colon -- which will then avoid the ambiguities you mentioned above.
Not that I have a position on this (at least as yet). I am just bringing in some food for thought, to accommodate the realistic issues of MathML. --- RBS
On Wed, 27 Sep 2006, Roger B. Sidje wrote: > On 27/09/2006 11:23 AM, Ian Hickson wrote:
> > The problem with adding entities is that a LOT of people do things > > like
> > href="/u?aa=foo&ab=foo&ac=foo&ad=foo"
> > ...which today works, but would break if MathML entities were > > introduced (since &ac is a MathML entity).
> That list is so big that trying to hand-pick some and leaving some out > would need another committee...
Not really... I say we just add ApplyFunction, InvisibleComma, and InvisibleTimes (but not their short aliases).
> > > Is it okay to assume HTML5 (with its <!DOCTYPE html>) as supporting > > > all W3C entities _by default_?
> > Don't do anything based on the DOCTYPE. HTML5 is anything sent as > > text/html.
> I thought the DOCTYPE was trustworthy -- based on this excerpt from the > HTML5 spec:
> "HTML documents that use the new features described in this > specification must start with the string <!DOCTYPE html> and, if they > are served over the wire (e.g. by HTTP) must be labelled with the > text/html MIME type."
That's an authoring conformance requirement, and has no bearing on implementations.
> BTW, for my own information, do you intend HTML5 to be transitional, > almost-standards, or strict?
HTML5 documents starting with <!DOCTYPE HTML> must be in standards mode. Documents with other DOCTYPEs or no DOCTYPE at all may be in another mode, as already described in the spec. In due course I may specify quirks mode and then there'll just be the spec, and no other modes.
> If it is HTML5 (or XHTML5) served as text/html but put in the XHTML > namespace at some later stage (as the HTML5 implies), it better be > strict, no? And that would be driven by the DOCTYPE detection code. > Catch my drift? Or is tag soup going to be in the XHTML namespace?
Not sure what you mean by that. All HTML DOM nodes are (per HTML5) in the XHTML namespace, irrespective of the standards/quirks thing.
> If it is strict then maybe entities could be required to have a > semi-colon -- which will then avoid the ambiguities you mentioned above.
That would break back-compat.
-- Ian Hickson
Ian Hickson <i...@hixie.ch> writes: > On Wed, 27 Sep 2006, Roger B. Sidje wrote: > . . . >> Implementation-wise, as this inclusion of MathML-in-HTML5 marks the >> beginning of tag soup, ...
> Don't worry, these tags auto-close when a parent tag is closed.
Two points for clarification:
1. There's the old issue, related to dual parsers, of trying to get Mozilla family user agents to give proper handling of XHTML+MathML when served through text/html -- following early Amaya practice. (In the end the W3C HTML WG refused to support this idea and spawned the mimetype application/xhtml+xml.) It seems that formally correct XHTML+MathML would now gain coverage as text/html under current WhatWG thinking, at least when XML namespaces are evident only through use of the xmlns attribute (which would be ignored in tag soup), i.e., no use of xml namespace prefixing. Is this correct?
2. Is WhatWG entertaining the idea that off-the-cuff tag soup writers will generate MathML content that's good enough for Mozilla rendering?
---
In case you don't know:
The W3C Math group has announced that it is beginning to think seriously about author-level markup for math.
Long term -- say ten years in the future (we've already been at this for ten years) -- I think author level math additions to the tag soup vocabulary would work out much better, especially with enhanced CSS support.
Cheers.
-- Bill
---------------------------------------------------------------------- William F. Hammond Dept. of Mathematics & Statistics 518-442-4625 The University at Albany hammond At math.albany.edu Albany, NY 12222 (U.S.A.) http://www.albany.edu/~hammond/ Dept. FAX: 518-442-4731 ----------------------------------------------------------------------
I don't think I saw Ian's original comment, just Roger's reply.
> What I would be proposing for HTML5 is just the following list of > elements: > > math, mrow, mfrac, msqrt, mroot, mstyle, merror, mpadded, mphantom, > mfenced, menclose, msub, msup, msubsup, munder, mover, munderover, > mmultiscripts, mtable, mlabeledtr, mtr, mtd, maction
You would need to include the leaf elements (mi mn mo mtext), otherwise there'll be no characters in the mathml! Also, mspace is pretty important.
But as a more general point, I think it's dangerous for a spec to be profiled by _implementations_. The Math WG activity has just been restarted at W3C, and if there is a need to profile MathML to presentation MathML (or a subset thereof), please can it be done _there_ so that there is some chance that mathml authoring tools can be customised to have options to generate code to match any profiled spec.
> I don't like mlabeledtr very much (I have already expressed my views > about it to folks of the MathML WG)
_Now_ would be a really good time to make such comments, as we are in the process of finalising the requirements for what extra features should be in MathML3 and which features, if necessary, should be deprecated.
I don't remember specific discussions about an <mtr label="...">. I would guess there would be some concern about the label being an attribute rather than an element restricting the possibilities, but implementation advice on difficulties with the current scheme would be taken seriously....
Ian wrote about entities
> Yeah... Do we really need those? Some of them seem reasonable to add, but > 2000 seems like too many for the mnemonic advantage to beat just using > Unicode codepoints...
I'd say that it's probably not worth including only a few; it would just lead to confusion. The problem is that much mathml is generated using tools, and those tools may use entities. If they do, the user hasn't much control over which are used, or over how to fix things to remove entities that are not supported in the browser. It would be better to just get the MathML authoring tools to use characters or character refs directly and tell the user mathml entities are not supported (but html ones are).
> 1. There's the old issue, related to dual parsers, of trying to get Mozilla family user agents to give proper handling of XHTML+MathML when served through text/html -- following early Amaya practice. (In the end the W3C HTML WG refused to support this idea and spawned the mimetype application/xhtml+xml.) It seems that formally correct XHTML+MathML would now gain coverage as text/html under current WhatWG thinking, at least when XML namespaces are evident only through use of the xmlns attribute (which would be ignored in tag soup), i.e., no use of xml namespace prefixing. Is this correct?
I'm confused by your terminology.
MathML using namespaces and XML syntax would not, under the WHATWG proposals here, be formally correct. XML sent as text/html is never correct per the "WHATWG thinking".
What is being proposed here is a non-XML syntax, to be formally described in the HTML5 specification, which, when processed by an HTML5 UA, would generate a DOM that can then be processed per the MathML2 specification.
Per the WHATWG specifications, the presence of an "xmlns" attribute is always a conformance error in any content sent as text/html.
> 2. Is WhatWG entertaining the idea that off-the-cuff tag soup writers will generate MathML content that's good enough for Mozilla rendering?
The idea being entertained is that off-the-cuff HTML5 authors, and HTML5 editors, would create content which, when processed by an HTML5 UA (such as Mozilla, in due course), would render as MathML markup would.
> The W3C Math group has announced that it is beginning to think seriously about author-level markup for math.
> Long term -- say ten years in the future (we've already been at this for ten years) -- I think author level math additions to the tag soup vocabulary would work out much better, especially with enhanced CSS support.
On the very short term, the proposal here is just a proof of concept. On the medium term (12 months) I was considering specifying more complex parsing rules for MathML such that the same MathML2-compatible DOM could be obtained from much smaller markup, e.g. by implying <mo> tags around operators and <mn> tags around numbers.
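As a rough illustration of what "implying" token tags could look like, here is a small Python sketch. It is only a guess at the flavour of such rules, not the parsing rules themselves: the `TOKEN` pattern, the single-letter identifier rule and the name `imply_tokens` are all assumptions made for the example.

```python
import re
from xml.sax.saxutils import escape

# One token per match: a number, a single-letter identifier, or any other
# non-space character, which is treated as an operator (an assumption).
TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([A-Za-z])|(\S))")


def imply_tokens(text):
    """Wrap bare numbers, identifiers and operators in <mn>, <mi> and <mo>."""
    parts = []
    for number, ident, op in TOKEN.findall(text):
        if number:
            parts.append(f"<mn>{number}</mn>")
        elif ident:
            parts.append(f"<mi>{ident}</mi>")
        else:
            # Escape markup-significant operators such as < and &.
            parts.append(f"<mo>{escape(op)}</mo>")
    return "<mrow>" + "".join(parts) + "</mrow>"


print(imply_tokens("2x + 10"))
```

An author would then type `2x + 10` and the resulting DOM would be the same as for the fully tagged `<mrow>` form.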
HTH,
-- Ian Hickson, http://ln.hixie.ch/ -- Things that are impossible just take longer.
> I don't remember specific discussions about an <mtr label="...">. I would guess there would be some concern about the label being an attribute rather than an element restricting the possibilities, but implementation advice on difficulties with the current scheme would be taken seriously....
It appeared that attributes (like those in <mfenced>) aren't unanimous either. But having a bloated tag that won't be implemented in the next several years isn't really helpful.
> Ian wrote about entities
>> Yeah... Do we really need those? Some of them seem reasonable to add, but 2000 seems like too many for the mnemonic advantage to beat just using Unicode codepoints...
> I'd say that it's probably not worth including only a few, it would just lead to confusion.
I am actually a fan of entities because they improve readability a fair bit. I hope Ian won't give up thinking on this issue so quickly... especially in the context of MathML where strange characters are quite common.
I suggested that "if [a document] is strict then maybe entities could be required to have a semi-colon -- which will then avoid the ambiguities", to which Ian responded, "That would break back-compat."
We have other cases of broken back-compat -- where users were told to use a non-strict DOCTYPE or some other workaround, e.g., line-height of images. --- RBS
Roger, Thanks for the link on <mtr label="mylabel">,
> It appeared that attributes (like those in <mfenced>) aren't unanimous either.
Yes, mfenced also "suffers" from requiring attributes, but probably one is more likely to need markup in an equation label than in a stretchy operator. It's not so uncommon to want superscript * or daggers etc. to highlight special versions of formulae, and mfenced is explicitly a shorthand form, so you can always use the mrow/mo form if you need an operator that is "decorated" in some way. That would not be the case here if mlabeledtr were deprecated and an attribute form was the only version. (Actually it would if the attribute could then be css-styled using css generated content.) Allowing css (or other mechanism) auto numbering is, I think, a highly requested feature for mathml3.
> (not on www-math, though. Maybe I should forward it there?)
Yes please do. When we are doing a pass for errata or pulling in feature requests for a new version we can do a more or less exhaustive check of the official comment list but (even with google's help) doing an exhaustive check of the entire web's a bit hard:-)
Extension of MathML with enhanced support for equation labeling, including automatic numbering, general label placement and style, and resolution of references.
so getting that specified in a way that ensures that implementations can implement it sounds like a good idea, and the timing is good now to get new features in this area if that is needed. If WhatWG members are interested in mathml, most of them are w3c members and could join the WG of course (currently only Opera is represented out of the main browser vendors). But WG membership isn't really needed; we can do the technical discussion on the public www-math list if that is appropriate.
> I am actually a fan of entities because they improve readability a fair bit.
Well, as you know, I've invested a frightening number of hours maintaining that entity set (and the draft iso set at www.w3.org/2003/entities, which is the same thing, really), so I also think they are valuable, although it's a kind of love-hate relationship most of the time:-)
> I hope Ian won't give up thinking on this issue so quickly... especially in the context of MathML where strange characters are quite common.
Yes I think the ideal situation is that they all be allowed. My comment was that subsetting them is likely to be more confusing than helpful.
> As to my suggestion that "if [a document] is strict then maybe entities could be required to have a semi-colon -- which will then avoid the ambiguities", to which Ian responded that, "That would break back-compat."
Requiring a ; would seem reasonable to me (i.e., make the lack of a ; turn the & into an implicit &amp; rather than be an error as in xml). That does have a theoretical backward compatibility problem in that → would be an arrow instead of &rightarrow; but I would have thought that the occurrences of any such construction outside of test suites was rather rare.
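A minimal Python sketch of the "require a semicolon" rule, to show the behaviour being discussed. The `ENTITIES` table is a tiny hypothetical subset, the name `expand_strict` is invented for the example, and leaving unknown names untouched is my assumption rather than anything proposed in the thread.

```python
import re

# A tiny illustrative subset of the entity set; the real set is far larger.
ENTITIES = {
    "rarr": "\u2192",   # rightwards arrow
    "alpha": "\u03b1",  # Greek small letter alpha
    "amp": "&",
    "lt": "<",
}

# A reference only counts when the name is terminated by a semicolon.
REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def expand_strict(text):
    """Expand `&name;` references; any other `&` is kept as a literal ampersand."""
    def replace(match):
        # Unknown names are left exactly as written (an assumption).
        return ENTITIES.get(match.group(1), match.group(0))
    return REF.sub(replace, text)


print(expand_strict("a &rarr; b &rarr c"))
```

Only the terminated reference is expanded; the unterminated one is treated as text rather than as an error, which is where this differs from XML.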
I consider switching from XML to text/html an inappropriate and pointless development; moreover, it is damaging in the long term.
First of all, it is unclear where this idea comes from, as the MathML community has no legacy text/html content that one should care about. All MathML content is well-formed (by definition), which means that one has fewer errors in MathML documents compared to what one would have with a tagsoup approach. It also means that all MathML content can be handled with XML tools, processed with XSLT, matched using XPath, mixed with other XML-based markup languages (OpenMath, SVG), etc. There is no single MathML implementation that supports text/html tagsoup but does not support X(HT)ML, while the inverse is not true: there are XML-only MathML implementations that by definition have nothing to do with HTML legacy.
Further, it is not clear to me why this has to be done today. After paying the price for well-formedness and tackling XML-related problems for seven years, when MSIE/MathPlayer finally accept application/xhtml+xml and thus allow people to deliver the same XHTML+MathML to MSIE/MathPlayer and Mozilla (one can add Opera with UserJS), someone decides to revert (more precisely, convert) everything to tagsoup.
The profiling policy sounds unclear and strange to me. Solving the issue on the level of "I'm happy to drop/add any tag to this list. Just give me the list you want", or based on the MathML support level of some particular implementation, seems irresponsible. There are at least two subgroups in the W3C Math WG to which one could send a profile proposal after looking at the "wrong table". One is the liaison with WhatWG subgroup and, as the name suggests, it is expected to ensure that the needs of MathML are addressed in WhatWG specs. Another is the liaison with CSS subgroup, which is expected to define a MathML profile suitable for usage in the XML+CSS framework and a few CSS extensions needed to format the proposed MathML profile. There is also a subgroup that deals with compound document formats. My opinion is that profiling of MathML should be coordinated with these units, as irresponsible steps may spoil W3C efforts in the same area.
One more thing that sounds illogical and rather strange is that Mozilla/WhatWG try to move MathML further from the XML+CSS framework, by converting XML to tagsoup with ad hoc parsing rules and embracing constructions like mstyle and mpadded in the "proposed" profile.
> > > Yeah... Do we really need those? Some of them seem reasonable to add, but 2000 seems like too many for the mnemonic advantage to beat just using Unicode codepoints...
> > I'd say that it's probably not worth including only a few, it would just lead to confusion.
> I am actually a fan of entities because they improve readability a fair bit. I hope Ian won't give up thinking on this issue so quickly... especially in the context of MathML where strange characters are quite common.
I really don't want to start introducing weird rules for parsing entities (I'm trying to simplify the entity parsing rules, not make them worse). At least not at this stage. Maybe once we have a proof-of-concept working, it would make more sense to revisit the issue, but I'd want to do a thorough scan of the Web to see how common these entities actually are today.
> As to my suggestion that "if [a document] is strict then maybe entities could be required to have a semi-colon -- which will then avoid the ambiguities", to which Ian responded that, "That would break back-compat."
> We have other cases of broken back-compat. -- where users were told to use a non-strict DOCTYPE or some other workaround, e.g, line-height of images.
Yeah. And we can see how well _that_ went. QA nightmare, multiple overlapping codepaths, obscure bugs, confused authors, contradicting documentation, etc. Let's not go there again. The whole point of MathML-in-HTML is to have back-compat work -- if we didn't care about back-compat, we would just have people use MathML-in-XHTML.
-- Ian Hickson, http://ln.hixie.ch/ -- Things that are impossible just take longer.