suggestion: use <span> instead of <div> for Mathjax_Display

timt...@gmail.com

Apr 14, 2014, 9:16:55 PM
to mathja...@googlegroups.com
Hey guys, I'm hoping to start a discussion on the implication of using <div> elements to delimit Display-style math for MathJax CSS/HTML.

In the course of designing CSS for extremely math-heavy online articles, I've come to realize that it's almost impossible to reconcile the widely accepted math-writing convention that "display equations should still be considered part of a sentence" with the fact that MathJax wraps them in a <div>. This is because <div> tags are not, in fact, allowed inside <p> tags, as both are flow/block-level content.

This restriction has nothing to do with specific browser implementations. It's not even valid HTML. Here are the specs for the <p> tag. To pass validation, it is only allowed to contain "phrasing content":


What are phrasing contents? Basically, not block-level stuff, and certainly no <div>s:


So how do most browsers behave when confronted with a <div> inside a <p> tag? They actually close the <p> for you right before the <div>.

This obviously causes problems if you are trying to produce semantically relevant HTML that contains display-style math.

Moreover, the problems are not restricted to semantics. They also cause serious real-world problems in terms of layout:

1. Paragraph indents will be wrong if browsers turn <p><div class=Mathjax_Display></div></p> into <p></p><div class=Mathjax_Display></div><p></p>. You end up with two paragraphs where there should be one.

2. Modern CSS best practices prescribe implementing some sort of underlying line grid based on the content-text line height. As a consequence, it is extremely common for <p> tags to have zero top margin and a bottom margin of the full line-height, to make them conform to "the grid". If <p> elements are always closed by the browser before a <div> tag, then we will always end up with a full line-height of extra whitespace immediately before a display equation. We can't even solve this problem with CSS selectors; there is, famously, no "predecessor" selector, so there is in fact no way to select <p> blocks immediately before a "Mathjax_Display" element in order to disable their bottom margin.

The only way I can see this resolved in a pure HTML/CSS way without jQuery tricks is to use <span> tags instead of <div> for "Mathjax_Display" elements, and then set the "display: block" style (in fact, I believe this is already the case). Is there a reason why this is not currently done? I tried to look for previous discussions of this but I can't seem to find anything relevant.

ernest...@gmail.com

Apr 15, 2014, 12:17:28 AM
to mathja...@googlegroups.com, timt...@gmail.com
I agree that display/block equations should be allowed within paragraphs as well.

I wonder if there are any drawbacks...

William F Hammond

Apr 15, 2014, 12:53:45 AM
to mathja...@googlegroups.com

On Mon, Apr 14, 2014 at 6:16 PM, <timt...@gmail.com> wrote:
What are phrasing contents? Basically, not block-level stuff, and certainly no <div>s:

In HTML 5 <math> is allowed in phrasing content as well as in flow.  So <math> is allowed directly in <p> whether it's displayed or inline.

    -- Bill

ernest...@gmail.com

Apr 15, 2014, 1:07:12 AM
to mathja...@googlegroups.com
However, the output generated from MathJax uses a <div> element which isn't allowed within the <p> element semantically: http://stackoverflow.com/questions/10763780/putting-div-inside-p-is-adding-an-extra-p

Good thing that <math> is good though.

David Carlisle

Apr 16, 2014, 11:08:22 AM
to mathja...@googlegroups.com
On 15 April 2014 06:07, <ernest...@gmail.com> wrote:
However, the output generated from MathJax uses a <div> element which isn't allowed within the <p> element semantically: http://stackoverflow.com/questions/10763780/putting-div-inside-p-is-adding-an-extra-p

Good thing that <math> is good though.

On Tuesday, 15 April 2014 16:53:45 UTC+12, William F Hammond wrote:

On Mon, Apr 14, 2014 at 6:16 PM, <timt...@gmail.com> wrote:
What are phrasing contents? Basically, not block-level stuff, and certainly no <div>s:

In HTML 5 <math> is allowed in phrasing content as well as in flow.  So <math> is allowed directly in <p> whether it's displayed or inline.

The content model of html p has always been basically irretrievably broken as far as semantically oriented paragraphs are concerned. It is not just math, lists should also be considered part of a paragraph but can not be in p.  If using p you have to just consider it to denote not a paragraph but a text block that forms part of a paragraph together with displayed lists and equations. Or (as we do here) simply use div with an appropriate class. In many ways when div was introduced to html it was welcomed as "p with a fixed content model".

David


William F Hammond

Apr 16, 2014, 2:43:47 PM
to mathja...@googlegroups.com

On Wed, Apr 16, 2014 at 8:08 AM, David Carlisle <d.p.ca...@gmail.com> wrote:
If using p you have to just consider it to denote not a paragraph but a text block that forms part of a paragraph together with displayed lists and equations.

I agree that <p> is irretrievably broken.  There was a group in the xhtml crowd that wanted to fix it (but did they?).  My memory from the early days is that <div> was "marketed" as both for sections and utility blocks.  I believe that ISO HTML, which, as I recall, appeared to be spun from W3C HTML-2, did have sections.

The main problems I recall having with <p> are lists and tables (corresponding to unfloated tabular), but <math> in display mode has always been OK in <p>. In my experience most browsers that have rendered mathml have handled the displayed form within <p> correctly -- at least by providing appropriate newlines if not always by also centering.  Certainly it's good with native Firefox rendering, with MathJax, and with Fred Wang's CSS, and it was always good with pre-Firefox Mozilla and IE/MathPlayer.

timt...@gmail.com

Apr 17, 2014, 9:12:42 PM
to mathja...@googlegroups.com, timt...@gmail.com
I entirely agree that <p> is a broken model and should pragmatically be replaced with <div> whenever possible. Obviously, though, sometimes people's hands are tied if they cannot directly author the HTML, with the most ubiquitous case being the various Markdown renderers. The mighty <p> is a mostly undisputed part of the Markdown spec and appears in every existing implementation today.

In Markdown, lists and tables and such are explicitly block elements and cannot be part of a paragraph. On the other hand, the syntax of math environments exists in a grey area, and it's probably not too late now to consider display math an inline element.

So from the discussion so far it seems like a pragmatic way forward is to probe the DOCTYPE and use <math> for HTML5, and <span> for HTML4, both with display: block. From my point of view there don't seem to be many side effects if the class names remain the same. It'll obviously break for people who use div.classname selectors, but isn't that considered bad practice anyways?

William F Hammond

Apr 18, 2014, 1:15:30 AM
to mathja...@googlegroups.com, timt...@gmail.com

On Thu, Apr 17, 2014 at 6:12 PM, <timt...@gmail.com> wrote:
    . . .
In Markdown, lists and tables and such are explicitly block elements and cannot be part of a paragraph. On the other hand, the syntax of math environments exists in a grey area, and it's probably not too late now to consider display math an inline element.

OK, I think you're talking about HTML with TeX-like math markup passed through MathJax rather than normal HTML with MathML.  For the latter it's not grey.  For the former it's the domain of MathJax, and is likely too volatile to try to style on your own.  If you want to be able to write your own CSS for math, it might be better to write LaTeX, profiled for either Tex4ht or LaTeXML, which will generate HTML with MathML, and write your CSS for that.

. . . probing the DOCTYPE and use <math> for HTML5, and <span> for HTML4, both with display: block.

<math> was not allowed in HTML4.  Do you mean XHTML?  HTML5, XHTML5, and XHTML 1.1 + MathML are the major cases today.
 
    -- Bill

timt...@gmail.com

Apr 18, 2014, 1:31:10 AM
to mathja...@googlegroups.com, timt...@gmail.com
<math> was not allowed in HTML4.  Do you mean XHTML?  HTML5, XHTML5, and XHTML 1.1 + MathML are the major cases today.

Probably a reading mistake. I did suggest <span> for HTML4.

OK, I think you're talking about HTML with TeX-like math markup passed through MathJax rather than normal HTML with MathML.  For the latter it's not grey.  For the former it's the domain of MathJax, and is likely too volatile to try to style on your own.  If you want to be able to write your own CSS for math, it might be better to write LaTeX, profiled for either Tex4ht or LaTeXML, which will generate HTML with MathML, and write your CSS for that.

Ah, yes.  MathML doesn't have this problem since it exclusively uses <math>. I'm mainly talking about the LaTeX input jax and the HTML/CSS output jax. Specifically, this kind of situation:

<p>
Let's consider the basic axiom:
\[ a(b+c) = ab + ac, \]
from which our discussion today will stem.
</p>

MathJax with LaTeX input will transform the display expression into a <div>. Depending on the browser, this may or may not result in the following:

<p>
Let's consider the basic axiom:
</p><div class=Mathjax_Display>…</div>
from which our discussion today will stem.
</p>

Hence we inadvertently end up with two fragmented paragraphs.

Just to be clear, there are obviously ways to work around this on a case-by-case basis, and I am not actually asking for help in terms of content creation. I'm simply suggesting that perhaps using <span> is an improvement over <div>. Perhaps mentioning <math> was a red herring, as that naturally means MathML in most cases.

Davide P. Cervone

Apr 20, 2014, 10:21:01 AM
to mathja...@googlegroups.com
Historically, MathJax used span's for in-line math and div's for displayed content because that made it easier to distinguish them for purposes of styling and so on, and it seemed natural to put the block-level MathML in a block-level element.  Earlier versions of MathJax didn't have the MathJax_Display wrappers, and didn't set the display style explicitly.  Those things came in over time in order to work around pages that set CSS for div's and span's that interfered with MathJax's use of these tags.

You are correct that span's with display:block would work in place of div's, but my understanding is that this is technically illegal as well (though not prevented in practice).

One thing I don't understand is what you are trying to do when you say "semantically relevant HTML".  The output of MathJax is not very semantic, and I think it would be a mistake to try to use that for semantic analysis.  It would be better to either analyze the original HTML (before processing by MathJax), or to obtain the MathML internal representation from MathJax rather than use the convoluted HTML output.  So I'm not really sure what you are after.

As for validation, the output of MathJax is not what would be sent to a validator, but rather the original HTML, so MathJax's output is rather immaterial in terms of that.  One can not save the MathJax HTML output for reuse, since it is dependent on the browser, the OS, the fonts available, the zoom level, and a number of other factors that are specific to each individual user, so it should never end up being kept in some permanent form.

On a technical note, your description of the results of mixing the <div>'s with <p>'s is not quite correct.  What you describe would (almost) be the case if the MathJax output were being processed by the browser's HTML parser (e.g., if the MathJax output were in the HTML file originally).  But the MathJax output is not parsed by the browser; MathJax modifies the DOM by hand by creating and inserting elements into the existing tree.  When MathJax adds a <div> to a <p>, the <div> is inserted as a child of the <p>, and the <p> is not split.  (See example and output below.)  It is only when the browser parses a string of serialized HTML that the behavior you describe occurs.  So in MathJax output, the <div>'s are children of the <p>'s, and there is no ambiguity about whether they are part of the paragraph or not.  Indeed, the distinction between a <span> child of the <p> and a <div> child of the <p> lets you distinguish in-line from display math (so you can get more information in that sense than you could otherwise).  Also, there is no problem with CSS because the paragraph doesn't end and you don't get extra paragraph spacing or other unexpected effects.

Here is an example (available at http://jsfiddle.net/r38f9/ for experimentation):

<!DOCTYPE html>
<html>
<head>
<title>Test of p and div</title>
<style>
  p {margin-bottom: 1em; background-color:red}
  p > div {border: 1px solid black}
  .box {width: 200px; float: left; margin-right:2em}
</style>
</head>
<body>

<div class="box">
  <p>
  Insert here:
  <div>Inserted div</div>
  That was inserted.
  </p>
  <p>Another paragraph.</p>
</div>

<div class="box">
  <p>
  Insert here:
  <span id="insert">Removed</span>
  That was inserted.
  </p>
  <p>Another paragraph.</p>
</div>

<script>
  var ins = document.getElementById("insert");
  var p = ins.parentNode;
  var div = document.createElement("div");
  div.innerHTML = "Inserted div";
  p.insertBefore(div,ins);
  p.removeChild(ins);
</script>

</body>
</html>

This shows the difference between the parsing of the <div> inside a <p> in the original file and a <div> that is inserted into the <p> by javascript.  Here, the paragraphs have a red background and a bottom margin of 1em.  I've also styled a <div> that is a child of a <p> so that it has a black border.  The result is the following:


The left-hand column is when the <div> is in the <p> in the HTML file.  As you point out, the <div> does end the initial paragraph, and you get the paragraph spacing above the <div>.  Note, however, that a new paragraph doesn't begin after the <div>; the "That was inserted." is not inside a paragraph, as you can tell since it has no red background (it is just a text node in the DOM).  The original <p> is closed when the <div> starts, but it is not opened again afterward.  The text following the <div> becomes a text node in the parent of the <p>, and when the </p> is encountered, it is given an opening <p> at that point, producing an empty <p></p>.  That empty <p> adds the paragraph bottom margin, and then the following paragraph is in red, as expected.

The right-hand column has the <div> inserted dynamically by the javascript at the end (which replaces the original <span> with a <div>).  Note that this <div> remains part of the paragraph (it has a red background since it is in the paragraph, there is no paragraph spacing, and the <div> gets the black border from the CSS rule for p > div).

Here is the DOM after the script runs (from the DOM inspector in my browser):


Note that the first column (the first <div> with class "box") contains five nodes:  the <p> that was ended at the <div>, the <div> itself, the text node that is not part of a paragraph, the empty <p> produced from the dangling </p>, and the final paragraph.

The second column, however, contains two elements:  the two paragraphs that were originally there.  The first paragraph contains the inserted <div>.  Note that that paragraph is not split by the insertion of the <div>.
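
The difference between the two columns can be mimicked with a toy model. The sketch below is not the real HTML parsing algorithm: it hard-codes only the two rules discussed above (a <div> or <p> start tag closes an open <p>, and a stray </p> produces an empty <p></p>), it drops whitespace-only text that a real DOM would keep, and the names parseToy and childTags are made up for illustration.

```javascript
// Toy model of two HTML tree-construction rules (NOT a real HTML parser):
//   1. a <div> or <p> start tag closes any <p> that is still open;
//   2. a </p> with no open <p> behaves like an empty <p></p>.
function parseToy(html) {
  const root = { tag: "#root", children: [] };
  const stack = [root]; // stack of open elements
  const current = () => stack[stack.length - 1];
  const isOpen = (tag) => stack.some((node) => node.tag === tag);
  const closeThrough = (tag) => {
    // Pop open elements up to and including the nearest <tag>.
    while (stack.length > 1 && stack.pop().tag !== tag) { /* keep popping */ }
  };
  const open = (tag) => {
    const node = { tag, children: [] };
    current().children.push(node);
    stack.push(node);
  };

  for (const [, slash, name, text] of html.matchAll(/<(\/?)([a-z]+)[^>]*>|([^<]+)/gi)) {
    if (text !== undefined) {
      // Simplification: whitespace-only text is ignored.
      if (text.trim() !== "") current().children.push({ tag: "#text", text: text.trim() });
      continue;
    }
    const tag = name.toLowerCase();
    if (slash) {
      if (tag === "p" && !isOpen("p")) open("p"); // rule 2: stray </p>
      if (isOpen(tag)) closeThrough(tag);
    } else {
      if ((tag === "div" || tag === "p") && isOpen("p")) closeThrough("p"); // rule 1
      open(tag);
    }
  }
  return root;
}

// Helper: list the tag names of a node's children.
const childTags = (node) => node.children.map((child) => child.tag);

// Left column: the <div> is already in the serialized HTML, so the parser rules apply.
const leftBox = parseToy(
  '<div class="box"><p>Insert here:<div>Inserted div</div>That was inserted.</p>' +
  '<p>Another paragraph.</p></div>'
).children[0];
console.log(childTags(leftBox)); // [ 'p', 'div', '#text', 'p', 'p' ]

// Right column: parse the valid markup, then swap the <span> for a <div>
// directly in the tree, the way a script manipulates the DOM.
const rightBox = parseToy(
  '<div class="box"><p>Insert here:<span id="insert">Removed</span>That was inserted.</p>' +
  '<p>Another paragraph.</p></div>'
).children[0];
const firstParagraph = rightBox.children[0];
firstParagraph.children[1] = { tag: "div", children: [{ tag: "#text", text: "Inserted div" }] };
console.log(childTags(rightBox));       // [ 'p', 'p' ]
console.log(childTags(firstParagraph)); // [ '#text', 'div', '#text' ]
```

The point of the second half is only that editing the tree never runs the parser's rules, so the paragraph is not split.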

So it seems to me that your points (1) and (2) don't really apply (unless you are trying to save the HTML for re-use later, which is not supported, or are using innerHTML to move the HTML around, which will not work properly as it will lose the event handlers that MathJax applies to some elements).  I have checked this in Firefox, Safari, Chrome, Opera, and IE (back to IE7) and all produce this same result.

All that being said, it would certainly be possible to change the <div> to a <span>, but I really don't see what the advantage is.  Since there are people who have been using CSS rules based on <div>'s versus <span>'s, and such a change would break their pages, I would need something more compelling than what I've seen so far to support that change.

Davide

ernest...@gmail.com

Apr 20, 2014, 10:42:45 PM
to mathja...@googlegroups.com, dp...@union.edu
Thanks Davide,
I see how it works now.

yangg...@gmail.com

Nov 9, 2014, 2:09:06 PM
to mathja...@googlegroups.com
This discussion is very disappointing.

What I have been doing is using MathJax inside a contenteditable element. Because of the <div> tag, I had to build workarounds, and now some of the features are just impossible with this kind of arrangement.

Is there currently a way to change the default tag name for display elements?
Otherwise it means I will just have to wait for KaTeX to mature.

liam.m...@gmail.com

Nov 17, 2014, 1:03:55 PM
to mathja...@googlegroups.com, timt...@gmail.com
Why not something like this? Instead of the dollar signs or "\[ ... \]" to delimit the math, use HTML elements like this:

<p>Here is my inline equation: <span class="mathjax">a(b+c) = ab + ac</span>. I hope you like it.</p>
<p>Here is my block equation:</p>
<div class="mathjax">a(b+c) = ab + ac</div>

MathJax would render using div or span as appropriate. The processClass and ignoreClass configuration would not be needed; it would just consider anything inside the mathjax elements to be math.

Davide P. Cervone

Nov 17, 2014, 2:09:08 PM
to mathja...@googlegroups.com, timt...@gmail.com
This was the approach used by jsMath, the predecessor to MathJax (but using class="math" rather than class="mathjax").  MathJax includes a preprocessor that handles this format (the jsmath2jax extension), and it could easily be modified to change the class.
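
For anyone who wants to try the jsMath-style markup, a minimal sketch of a MathJax v2 configuration that loads the jsMath2jax preprocessor might look like the following. The variable name jsMathStyleConfig is made up for illustration, and the choice of the TeX input jax and HTML-CSS output jax is an assumption about the page. In an actual page the object would be passed to MathJax.Hub.Config from a text/x-mathjax-config script placed before MathJax loads.

```javascript
// Sketch only: a MathJax v2 configuration object enabling the jsMath2jax preprocessor.
// "jsMathStyleConfig" is a hypothetical name; the jax list is an assumed setup.
const jsMathStyleConfig = {
  extensions: ["jsMath2jax.js"],          // preprocessor for <span class="math"> / <div class="math">
  jax: ["input/TeX", "output/HTML-CSS"],  // TeX input rendered with the HTML-CSS output
  jsMath2jax: { preview: "TeX" }          // show the TeX source as a preview until typeset
};

// In the page (not run here):
//   MathJax.Hub.Config(jsMathStyleConfig);
console.log(JSON.stringify(jsMathStyleConfig, null, 2));
```

As noted above, the stock extension looks for class="math"; using class="mathjax" instead would mean modifying the extension.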

There are a couple of reasons that this isn't in common use.  First, it takes a lot more typing to enter <span class="mathjax">...</span> than it does to enter $...$, so you either need a smarter editor or have "helper" functions attached to your keys that insert predefined strings for you.  This certainly works, but takes extra setup.  Second, many people use MathJax in blogs or wiki settings where they are not allowed to enter raw HTML (or the HTML they can enter is limited), and so aren't able to enter such strings into their content-management systems.  Third, people who are used to typing TeX or LaTeX already are used to using $...$ and $$...$$ or \(...\) and \[...\], so they don't want to change to a different entry system.  Fourth, people reusing snippets from TeX code can copy and paste whole paragraphs of text with math without the need to change the math delimiters into HTML tags.

In any case, this doesn't address @yangge1987's complaint, which is about the use of <div> in the output of the HTML-CSS renderer, not the delimiters used for the TeX input format.  Also, what he wants is to be able to include the block math within the paragraph, so he doesn't want to end the paragraph before the block-level math, as in your example.

Davide

