> would it be possible to get more information about the fontdata.js content?
>
> In particular the VARIANT, RANGES and DELIMITERS.HW members?
VARIANT holds the information for handling the various mathvariant possibilities. So mathvariant="bold" corresponds to VARIANT["bold"]. The contents of the structure include:
fonts: The fonts to use for the variant (they are checked in order, and the first one containing the character is the one used),
remap: a structure that specifies characters to remap within this variant (this is done before looking through the fonts). The remapping can map into another variant as in 0x2216:[0x2216,"-TeX-variant"]
bold: indicates the font should have font-weight:bold
italics: indicates that the font should have font-style:italic
There are also offset and variant values that correspond to the RANGES array.
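As a rough sketch of how the remap and fonts entries interact, the lookup might go something like the following. The font tables, the variant data, and the lookup helper are all made up for illustration (they are not the actual MathJax data or API); the only things taken from the description above are that remap is consulted first, that a remap entry may be a [codepoint, variant] pair, and that the fonts are tried in order.

```javascript
// Hypothetical font coverage tables: which code points each font contains.
const FONTS = {
  MAIN: new Set([0x41, 0x2216]),
  AMS: new Set([0x2216, 0x210F])
};

// Hypothetical variant data in the shape described above.
const VARIANTS = {
  "normal": {fonts: ["MAIN", "AMS"], remap: {0x2216: [0x2216, "-TeX-variant"]}},
  "-TeX-variant": {fonts: ["AMS", "MAIN"]}
};

// Illustrative helper (not a MathJax function): remap first, then search the fonts.
function lookup(variantName, n) {
  let variant = VARIANTS[variantName];
  const target = variant.remap && variant.remap[n];
  if (Array.isArray(target)) {
    // [codepoint, variantName]: remap into another variant
    n = target[0];
    variant = VARIANTS[target[1]];
  } else if (target) {
    n = target;
  }
  // The first font in the list that contains the character wins.
  const font = variant.fonts.find((name) => FONTS[name].has(n));
  return font ? {font: font, codepoint: n} : null;
}

console.log(lookup("normal", 0x41));   // found in MAIN
console.log(lookup("normal", 0x2216)); // remapped to "-TeX-variant", found in AMS
```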
The RANGES array is for remapping groups of characters at once. This was originally intended for mapping variants to the Mathematical Alphabet ranges for the STIX fonts, but is also used to remap Greek and some other characters. Each range has an identifier (given by its "offset" property), and is only used in a VARIANT that has an offset with that letter. E.g., in the STIX data, there is a range
{name: "Alpha", low: 0x41, high: 0x5A, offset: "A"}
and this means that VARIANTs with an offsetA property will have the letters between 0x41 and 0x5A (capital letters) remapped. Because there is a VARIANT defined as
"double-struck": {offsetA: 0x1D538, offsetN: 0x1D7D8,
remap: {0x1D53A: 0x2102, 0x1D53F: 0x210D, 0x1D545: 0x2115, 0x1D547: 0x2119,
0x1D548: 0x211A, 0x1D549: 0x211D, 0x1D551: 0x2124}},
with offsetA:0x1D538, this means that capital letters in this variant will be mapped to the double-struck letters in the STIX fonts beginning at 0x1D538. So <mi mathvariant="double-struck">A</mi> will end up using the character in the STIX fonts at U+1D538. The offsetN takes care of the double-struck numbers.
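The remap property says that a few characters are exceptions (there are gaps in the Plane 1 characters since some double-struck characters already appear in the Letterlike Symbols block). Note that the keys of the remap are the Plane 1 positions (0x1D53A is 0x1D538 + 2, where double-struck C would otherwise land), so these exceptions are matched against the value that the RANGES offset produces.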
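Here is a minimal sketch of the double-struck example, using only the data quoted above. The remapChar helper is hypothetical (it is not a MathJax function), and the RANGES list is trimmed to the one entry shown. Since the remap keys are Plane 1 positions (0x1D53A is 0x1D538 + 2), the sketch assumes the exceptions are looked up on the offset code point.

```javascript
// Data copied from the STIX excerpts above, trimmed to what the example needs.
const RANGES = [
  {name: "Alpha", low: 0x41, high: 0x5A, offset: "A"}
];
const VARIANT = {
  "double-struck": {
    offsetA: 0x1D538,
    remap: {0x1D53A: 0x2102, 0x1D53F: 0x210D, 0x1D545: 0x2115, 0x1D547: 0x2119,
            0x1D548: 0x211A, 0x1D549: 0x211D, 0x1D551: 0x2124}
  }
};

// Illustrative helper: shift by the range offset, then apply the exceptions.
function remapChar(variantName, n) {
  const variant = VARIANT[variantName];
  for (const range of RANGES) {
    const base = variant["offset" + range.offset];
    if (base && n >= range.low && n <= range.high) {
      n = base + (n - range.low);
      break;
    }
  }
  if (variant.remap && variant.remap[n]) n = variant.remap[n];
  return n;
}

const hex = (n) => "U+" + n.toString(16).toUpperCase();
console.log(hex(remapChar("double-struck", 0x41))); // U+1D538 (A, no exception)
console.log(hex(remapChar("double-struck", 0x43))); // U+2102 (C, one of the gaps)
```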
The RANGES can also have a remapping, in which case the remapping value is an offset within the range. This is used with the Greek letters, for example, to map the sans-serif variant to the PUA glyphs in the STIX font for these characters (the remapping handles a few variant symbols that are in different locations in PUA than they are in the Greek and Coptic block).
The RANGES data can also include an "add" property, which is an additional offset for this range. This allows two ranges to use the same offsetX in the VARIANT list. For example, the upper and lower case letters in the Math Alphabets need this because the upper and lower case letters in the ASCII range have several characters between them, but in the Math Alphabet blocks, there are none. So two separate ranges are used, with a common offsetA value, but "add" tells where to start the lower case letters.
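To illustrate the "add" property, here is a small sketch with two ranges that share offsetA. The range entries and the bold offset are assumptions chosen to match the Unicode Mathematical Alphanumeric Symbols layout (bold capitals start at U+1D400, with the lower-case letters following immediately), and applyRanges is a hypothetical helper, not MathJax code.

```javascript
// Assumed ranges: upper and lower case both use offsetA; "add" skips past
// the 26 capitals so lower case starts right after them in the math block.
const letterRanges = [
  {name: "alpha", low: 0x61, high: 0x7A, offset: "A", add: 26},
  {name: "Alpha", low: 0x41, high: 0x5A, offset: "A"}
];

// Assumed variant: bold capitals begin at U+1D400.
const boldVariant = {offsetA: 0x1D400};

// Illustrative helper: the first matching range decides the new position.
function applyRanges(variant, ranges, n) {
  for (const range of ranges) {
    const base = variant["offset" + range.offset];
    if (base && n >= range.low && n <= range.high) {
      return base + (n - range.low) + (range.add || 0);
    }
  }
  return n; // not in any range that this variant remaps
}

const show = (n) => "U+" + n.toString(16).toUpperCase();
console.log(show(applyRanges(boldVariant, letterRanges, 0x41))); // U+1D400 (A)
console.log(show(applyRanges(boldVariant, letterRanges, 0x61))); // U+1D41A (a)
```

Without "add", a lower-case "a" would need its own offset property in every VARIANT; with it, a single offsetA covers both cases.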
A VARIANT value can also include "variantX" (where X is the letter used to identify the RANGES entry), which not only remaps the character positions, but also switches to another VARIANT. So, for example, the lower-case Greek letters in the MathJax normal variant are mapped to the italic variant, since there are no upright lower-case Greek letters in the font set.
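A sketch of the "variantX" switch might look like this. The Greek range limits, the offsetG value, the font names, and the resolve helper are all assumptions made for the example; the point is only that a matching range can change the variant as well as the character position.

```javascript
// Assumed range covering the lower-case Greek letters.
const greekRanges = [
  {name: "greek", low: 0x03B1, high: 0x03F6, offset: "G"}
];

// Assumed variants: "normal" sends lower-case Greek to the italic variant.
const variants = {
  "normal": {fonts: ["MAIN"], offsetG: 0x03B1, variantG: "italic"},
  "italic": {fonts: ["ITALIC"]}
};

// Illustrative helper: returns both the new position and the variant to use.
function resolve(variantName, n) {
  const variant = variants[variantName];
  let name = variantName;
  for (const range of greekRanges) {
    const base = variant["offset" + range.offset];
    if (base && n >= range.low && n <= range.high) {
      n = base + (n - range.low);
      const switched = variant["variant" + range.offset];
      if (switched) name = switched; // variantX: continue in another VARIANT
      break;
    }
  }
  return {codepoint: n, variant: name};
}

console.log(resolve("normal", 0x03B1)); // alpha: same position, italic variant
console.log(resolve("normal", 0x41));   // A: unchanged, still normal
```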
As for DELIMITERS.HW, the DELIMITERS object lists the characters that can stretch either vertically or horizontally (MathJax doesn't support stretching in both directions for the same character). The "dir" property tells which direction the character stretches in, and the HW array is a list of characters that are stretched versions of the character. These are in increasing size, and the entries in the array are themselves arrays giving the height or width (depending on whether we are stretching vertically or horizontally) of the character, plus the font where it is found. So
0x0028: // (
{
dir: V, HW: [[1,MAIN],[1.2,SIZE1],[1.8,SIZE2],[2.4,SIZE3],[3.0,SIZE4]],
stretch: {top: [0x239B,SIZE4], ext: [0x239C,SIZE4], bot: [0x239D,SIZE4]}
},
says that U+0028 (left parenthesis) can stretch vertically, and there are five single-character sizes available. These are taken from the MAIN, SIZE1, SIZE2, SIZE3, and SIZE4 fonts, and the heights are 1em, 1.2em, 1.8em, 2.4em, and 3.0em respectively. If a parenthesis is needed in a larger size, it is made from the characters specified in the "stretch" property. These give the top, extender, and bottom pieces as a pair (the character number and font).
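The selection just described can be sketched roughly as follows. The pickDelimiter helper and the string constants standing in for V, MAIN, SIZE1, and so on are placeholders for illustration, and the real code may apply tolerances or scaling that this simplified version ignores.

```javascript
// Placeholder constants standing in for the ones used in fontdata.js.
const V = "V";
const MAIN = "MAIN", SIZE1 = "SIZE1", SIZE2 = "SIZE2", SIZE3 = "SIZE3", SIZE4 = "SIZE4";

const DELIMITERS = {
  0x0028: { // (
    dir: V, HW: [[1,MAIN],[1.2,SIZE1],[1.8,SIZE2],[2.4,SIZE3],[3.0,SIZE4]],
    stretch: {top: [0x239B,SIZE4], ext: [0x239C,SIZE4], bot: [0x239D,SIZE4]}
  }
};

// Illustrative helper: take the first single glyph that is big enough,
// otherwise fall back to building the delimiter from the stretch pieces.
function pickDelimiter(code, size) {
  const delim = DELIMITERS[code];
  if (!delim) return null; // not a stretchy character
  for (const [hw, font] of delim.HW) {
    if (hw >= size) return {kind: "single", size: hw, font: font};
  }
  return {kind: "stretch", parts: delim.stretch};
}

console.log(pickDelimiter(0x28, 1.5)); // single glyph, 1.8em from SIZE2
console.log(pickDelimiter(0x28, 4));   // larger than 3.0em: use the stretch parts
```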
The HW pairs can also include additional information. The data can actually be [size,font,scale,codepoint], where "size" is the size in ems of the character, "font" is the font from which to take it, "scale" is a decimal number used to scale the character (1.5 is 50% larger, .75 is 25% smaller, 0 is no change, and the default is 0), and "codepoint" is the Unicode position of the character to use (default is the position of the delimiter being stretched). So
0x23DC: // top paren
{
dir: H, HW: [[.778,AMS,0,0x2322],[1,MAIN,0,0x2322]],
stretch: {left:[0xE150,SIZE4], rep:[0xE154,SIZE4], right:[0xE151,SIZE4]}
},
says that there are two sizes of U+23DC, and that the first is in the AMS font at U+2322 and is .778em wide, while the second is in the MAIN font at U+2322 and is 1em wide.
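One way to read these extended entries is to expand each into named fields with the defaults filled in. The describeHW helper and the "factor" field are hypothetical; the defaults follow the description above (a scale of 0 or a missing scale means no change, and a missing codepoint means the delimiter's own position).

```javascript
// Illustrative helper: expand [size, font, scale, codepoint] with defaults.
function describeHW(delimiterCode, entry) {
  const [size, font, scale = 0, codepoint = delimiterCode] = entry;
  return {
    size: size,            // height or width in ems
    font: font,            // font the glyph is taken from
    factor: scale || 1,    // 0 (or omitted) means no scaling
    codepoint: codepoint   // glyph actually used
  };
}

// The two sizes of U+23DC from the example above (fonts as plain strings here).
const topParenHW = [[.778, "AMS", 0, 0x2322], [1, "MAIN", 0, 0x2322]];
for (const entry of topParenHW) {
  console.log(describeHW(0x23DC, entry));
}

// A plain [size, font] pair falls back to the delimiter's own code point.
console.log(describeHW(0x0028, [1.2, "SIZE1"]));
```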
The stretchy character data also can include additional data, used for fine tuning the positioning and sizing of the characters. The data is of the form [codepoint,font,dx,dy,scale,dh,dd] where "codepoint" is the Unicode position of the character to use, "font" is the font to take it from, "dx" and "dy" are horizontal and vertical offsets to apply to the character, "scale" is a scaling factor used to adjust the character size, and "dh" and "dd" are adjustments to make to the character's height and depth (to make it have more height or depth than its glyph actually has). The latter is used to make things like arrow extenders have the same vertical size as the arrowheads, for example.
I think that covers pretty much all of it. Let me know if something more needs explanation.
Davide