Brainstorming on math support in screen readers

Aaron Leventhal

Jan 15, 2009, 8:59:31 AM
to free...@googlegroups.com
Let's brainstorm on how a screen reader can make the most math
accessible to the most people.

Looking at our wiki table here:
https://wiki.mozilla.org/Accessibility/Math_Accessibility

Braille support would come from a library. However, there are a number
of libraries available now, each supporting different Braille codes. How
to choose?
The front runners would seem to be liblouisxml and UMCL. Some comparison
here:
* They are both LGPL.
* Liblouisxml works with liblouis, which NVDA and Orca already use. I'm
not sure whether that factors in much, and I'm not sure what UMCL uses
for any text it encounters within the math.
* They both can translate from MathML. However, UMCL also translates
from TeX. This is done by first transforming the TeX to MathML, which
liblouisxml could also do. John Boyer mentioned that he's considering
developing a liblouistex, and was looking for feedback on that.
* They both can translate to Nemeth, Marburg and British codes. However,
UMCL also supports French and Italian.
* They are both written in C, but have different formats for the tables.
UMCL uses XSLT, and liblouisxml uses a special format it has developed.

Here are some questions that would be good to answer. Given the similarities in
philosophy, license and programming language, is it possible for the two
projects to join forces? My experience says this is unlikely -- once
projects go in different directions, it's nearly impossible to share
code. I'd love for someone to look into the possibility anyway.
Another idea is to come up with a more abstract API for dealing with
Braille translation in general. This API would allow Braille translation
services to be installed like plugins. A user could install a Braille
translation package and the screen reader would add it, possibly
providing preferences if more than 1 translator is registered for the
same type of content. Not only screen readers, but Braille publishing
systems could also benefit from this approach. It would allow
translation engines to compete on speed, features and quality
separately, and make it easier for users to experiment with which ones
work best, mixing and matching depending on their needs.
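
As a rough illustration of the plugin idea, here is a minimal Python sketch of a registry that holds several translators per content type and lets the user state a preference. Everything in it is hypothetical: `BrailleRegistry`, its method names and the two stub translators are invented for the example and do not correspond to any existing liblouisxml or UMCL API.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical signature: a translator takes source text and returns braille.
Translator = Callable[[str], str]


class BrailleRegistry:
    """Keeps translators keyed by content type, e.g. "mathml" or "latex"."""

    def __init__(self) -> None:
        self._by_type: Dict[str, List[Tuple[str, Translator]]] = {}
        self._preferred: Dict[str, str] = {}

    def register(self, content_type: str, name: str, fn: Translator) -> None:
        """Called by a translation package when it is installed."""
        self._by_type.setdefault(content_type, []).append((name, fn))

    def names(self, content_type: str) -> List[str]:
        """List the translators available for one content type."""
        return [name for name, _ in self._by_type.get(content_type, [])]

    def prefer(self, content_type: str, name: str) -> None:
        """Record the user's choice when more than one translator exists."""
        if name not in self.names(content_type):
            raise KeyError(f"no translator {name!r} for {content_type!r}")
        self._preferred[content_type] = name

    def translate(self, content_type: str, source: str) -> str:
        """Use the preferred translator, or the first one registered."""
        candidates = self._by_type.get(content_type)
        if not candidates:
            raise LookupError(f"no translator for {content_type!r}")
        wanted = self._preferred.get(content_type, candidates[0][0])
        for name, fn in candidates:
            if name == wanted:
                return fn(source)
        raise LookupError(wanted)


# Stand-in translators; real ones would wrap an actual braille library.
registry = BrailleRegistry()
registry.register("mathml", "stub-a", lambda s: "A:" + s)
registry.register("mathml", "stub-b", lambda s: "B:" + s)
registry.prefer("mathml", "stub-b")
result = registry.translate("mathml", "<math/>")
```

A screen reader or a Braille publishing tool would only ever talk to the registry, so translators could be swapped without touching the caller.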

Now, for text-to-speech, a different approach may be needed. I think
that there is a need to tie more tightly to a given screen reader, so
that the navigation, voices and rest of the experience matches the rest
of the screen reader experience. A screen reader project like NVDA would
probably look for ideas from things like ASTER and latex-access, and
implement something for the community to at least play with, tweaking as
they go. Since latex-access is written in Python and GPL-licensed as well, it would be
worth investigating whether a few changes would allow it to become part of
NVDA. Or perhaps, the same principle of abstracting the API for this
should apply to TTS as well as Braille.

I hope this starts a useful conversation. Thoughts are welcome.

- Aaron

Alastair Irving

Jan 15, 2009, 3:27:42 PM
to free...@googlegroups.com
Hi

I definitely think that abstraction is necessary. All the translators
mentioned have their advantages and disadvantages and it seems
reasonable that someone dealing with a lot of maths would want access to
more than one. I would suggest an application similar to
speech-dispatcher on Linux which can handle the different translators
and has a standard API for communicating with screenreaders. This would
minimise the work needed from individual screenreader developers.

However, the method would have to be fairly flexible. For example, some
translators are going to provide full cursor routing information whereas
others are not. There also needs to be flexibility on the part of the
screenreader depending on the document being viewed. The following
examples spring to mind:
1: Editing a LaTeX document - in this case it would always be required
that everything is translated.
2: Webpage with graphic equations with LaTeX alt-tags (for example
Wikipedia) - in this case the screenreader should only translate the
alt-tags and the rest of the text should be treated as normal text and
translated into grade 2.
3: Webpage using MathML - in this case the screenreader needs to know to
extract the MathML source directly from the webpage so it can be passed to
the translator.
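
Cases 2 and 3 above could be sketched as follows, using only Python's standard `html.parser`. This is an illustration under assumptions, not how any screen reader does it: the class `MathFinder` and the helper `looks_like_latex` are made up for the example, the "contains a backslash" test is a deliberately crude heuristic, and a real screen reader would more likely read the MathML from the browser's DOM or accessibility API than re-parse the page.

```python
from html import escape
from html.parser import HTMLParser


def looks_like_latex(alt):
    """Crude, hypothetical heuristic: LaTeX alt text usually has a backslash."""
    return "\\" in alt


class MathFinder(HTMLParser):
    """Collects (kind, source) pairs for math fragments found in a page."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.found = []
        self._in_math = False
        self._buf = []

    def _check_img(self, attrs):
        alt = dict(attrs).get("alt") or ""
        if looks_like_latex(alt):
            self.found.append(("latex", alt))

    def handle_starttag(self, tag, attrs):
        if tag == "math":
            self._in_math = True
            self._buf = []
        if self._in_math:
            # Keep the tag exactly as written so the MathML can be rebuilt.
            self._buf.append(self.get_starttag_text())
        elif tag == "img":
            self._check_img(attrs)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <mspace/> or <img ... />.
        if self._in_math:
            self._buf.append(self.get_starttag_text())
        elif tag == "img":
            self._check_img(attrs)

    def handle_endtag(self, tag):
        if not self._in_math:
            return
        self._buf.append(f"</{tag}>")
        if tag == "math":
            self.found.append(("mathml", "".join(self._buf)))
            self._in_math = False

    def handle_data(self, data):
        if self._in_math:
            # Character references were decoded, so re-escape the text.
            self._buf.append(escape(data, quote=False))


page = (
    '<p>Euler: <img src="e.png" alt="e^{i\\pi} + 1 = 0"> and '
    '<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math></p>'
)
finder = MathFinder()
finder.feed(page)
finder.close()
fragments = finder.found
```

Everything outside the collected fragments would go through the ordinary text translator, and each fragment would be handed to whichever math translator accepts its kind.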

In conclusion, I think we need a layer of abstraction between
screenreaders and translators, but screenreaders still need to provide a
lot of flexibility in how/when these translators are called and how
their output is presented and navigated.


Alastair

Aaron Leventhal

Jan 15, 2009, 4:44:04 PM
to free...@googlegroups.com
Alastair,

In your mind, what might a rough draft API look like?

- Aaron

SusanJ

Jan 15, 2009, 5:22:38 PM
to free-math
I'm confused about the relationship between the translation rules for
braille that would be produced by the math support under consideration
and the official translation rules maintained by the various national
braille authorities. For example, in the US, the BANA rules mandate
that the Nemeth code applies to an entire document, both text and
math; there is no official use of Nemeth just for fragments of math.
(The Nemeth translation rules for text are similar to, but not
identical to, the rules for English Braille American Edition.)

BANA has a Refreshable Braille Technical Committee which I read
somewhere (can't find the link) is considering whether different rules
should apply to refreshable braille as opposed to hardcopy.

Neil Soiffer

Jan 15, 2009, 6:33:25 PM
to free...@googlegroups.com
For MathPlayer, I use a COM interface for both discovery and for communication with liblouis and UMCL (and whatever else comes along that uses that interface).  If we get to the point of discussing details, I can send IDL files.  The basic features are:
1.  Translators register that they support the IMathMLToBraille interface via a standard COM method
2.  Translators support the IMathMLToBraille interface and MathPlayer or Firefox or AT makes use of it.

IMathMLToBraille has several calls, but the main one is
    HRESULT MathMLToBraille([in] BSTR mathml, [in] VARIANT_BOOL isInline, [out] BSTR *braille);
where the result uses the Unicode braille dot patterns and it is expected the AT maps those codes/dots to what is needed by the connected braille display.  With this approach, AT doesn't need to "pre-decide" on liblouis or UMCL -- they, or better, users can decide based on which package best suits their needs.
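
To make the "Unicode braille dot patterns" step concrete: the Braille Patterns block starts at U+2800, and dot n (1 to 8) sets bit n-1 of the offset from that base. A small Python sketch of the conversion an AT might do before talking to a display follows. The function names are invented for the example, and the one-byte-per-cell output is only an assumed intermediate form, since each display protocol has its own dot-to-bit mapping.

```python
BRAILLE_BASE = 0x2800  # U+2800 is the blank braille pattern


def dots_to_char(dots):
    """Return the Unicode braille character for a set of dot numbers 1-8."""
    mask = 0
    for d in dots:
        if not 1 <= d <= 8:
            raise ValueError(f"dot number out of range: {d}")
        mask |= 1 << (d - 1)
    return chr(BRAILLE_BASE + mask)


def char_to_dots(ch):
    """Return the sorted dot numbers raised in a Unicode braille character."""
    mask = ord(ch) - BRAILLE_BASE
    if not 0 <= mask <= 0xFF:
        raise ValueError(f"not a braille pattern: {ch!r}")
    return [d for d in range(1, 9) if mask & (1 << (d - 1))]


def to_display_cells(braille):
    """Convert a Unicode braille string into one dot bitmask per cell.

    Raises ValueError if a character lies outside the braille block.
    """
    return bytes(ord(ch) - BRAILLE_BASE for ch in braille)
```

For example, `dots_to_char([1, 2])` gives U+2803, and `to_display_cells` turns a translated string into bytes that a display driver could remap to its own dot order.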

COM is widely used on Windows.  I assume that there is something similar for Mac and Linux, but I am sure that you know more about this than I do.  Using interfaces allows the projects to develop at their own pace and decouples code, both for development and distribution.  I wish Firefox had gone this route for MathML and other support -- there would have been accessible math for it years ago.


For speech, I suggest a similar approach:  communicate via an interface.  It is how AT works with MSAA (and IA2, etc) and how MathPlayer, via more specialized interfaces, works with lots of AT (http://www.dessci.com/en/solutions/access/atsupport.htm).

   Neil

Aaron Leventhal

Jan 15, 2009, 6:58:40 PM
to free...@googlegroups.com
Hi Neil,

It's great that this is already implemented by liblouis and UMCL! I
didn't realize that. Ultimately, I'd like to see a more generalized
interface than just for math, but this gets us what we need.

I appreciate your thoughts on Mozilla. BTW, back in 95/98 or whatever,
Netscape didn't want to deal with thread safety and the lack of a common
cross-platform transport mechanism. That's why they don't have a simple
pluggable binary component architecture. It would also be a very
difficult change to make, requiring changes in thinking across the project.

That would be great if you sent the IDL files.

- Aaron

Aaron Leventhal

Jan 15, 2009, 7:03:41 PM
to free...@googlegroups.com
Hi Susan,

Thank you for reminding me of that -- I'd forgotten.

It would be difficult to know to switch the entire document based on any
math within it, but I suppose that's possible. Screen readers scan the
entire document on load anyway. But imagine, this is the web, and
content can be inserted dynamically. Thus, a math image could be
inserted and now the entire document must change its Braille
translation. That would be odd. Also, I can imagine users that don't
know Nemeth getting very confused as they read an article that just
happened to have one equation in it.

Therefore, I suspect we may need to go with some kind of "begin Nemeth" and
"end Nemeth" markers. Ultimately though, whatever screen reader users
prefer for their own use is what we'll go with. I think the BANA rules
were intended for books with math in them. Does that make sense?
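
The marker idea could look roughly like the Python sketch below. It assumes the document has already been split into text and math segments. `BEGIN_MATH` and `END_MATH` are placeholder strings rather than real braille indicators, and `render` is a hypothetical helper that takes the text and math translators as plain functions.

```python
BEGIN_MATH = "[math>"   # placeholder, not a real braille indicator
END_MATH = "<math]"     # placeholder, not a real braille indicator


def render(segments, text_fn, math_fn):
    """Translate each segment with the matching translator.

    Only math segments are wrapped in markers, so inserting one equation
    into a page never changes how the surrounding text is translated.
    """
    out = []
    for kind, source in segments:
        if kind == "math":
            out.append(BEGIN_MATH + math_fn(source) + END_MATH)
        else:
            out.append(text_fn(source))
    return "".join(out)


segments = [("text", "Area is "), ("math", "pi r^2"), ("text", " exactly.")]
# Stub translators: identity for text, upper-casing stands in for a math code.
line = render(segments, lambda s: s, lambda s: s.upper())
```

With this shape, a dynamically inserted math image only adds one more segment, and the rest of the output stays as it was.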

- Aaron

Dominique Archambault

Jan 16, 2009, 4:26:50 AM
to free...@googlegroups.com
Hi Aaron and all

I can say a few more things on UMCL.
I'm developing it so it is mostly subjective ;-)

- it is crossplatform and was tested on linux/Mac/windows
- it has an option to process whole documents including text/maths. the output in that case is xhtml including the maths in a span with a specific class. this can be tuned if needed
- umcl model allows synchronisation. what we mean with synchronisation is that a software can allow sighted/blind users to work together with cross modal pointing possibilities. We had a prototype - which was evaluated with users in classrooms in France and Austria - where Braille readers can "click" on cursor routing keys to make the corresponding graphical sign appear on a different background, and in the other direction the teacher can click on the formula to make the corresponding Braille symbol underlined on the Braille display. the model has a specific linear output including Braille symbols and references. The library also has a "cleaning" function which removes all additional information to output a simple Braille string (both in printable ascii and in unicode). the maths symbols are also separated in the output so the software can break lines at relevant places if needed
- we have wrappers for Python and Java (based on swig). PHP is under dev
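
One way to picture the synchronisation model is a linear list of braille cells, each carrying a reference back to the source element it came from. The Python sketch below is only a guess at the general shape and is not UMCL's actual data format: the `Cell` type, the element ids and all three functions are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    braille: str          # one Unicode braille character
    ref: Optional[str]    # id of the source MathML element, if any


def clean(cells: List[Cell]) -> str:
    """Drop the references and keep a plain braille string."""
    return "".join(c.braille for c in cells)


def element_at(cells: List[Cell], index: int) -> Optional[str]:
    """Routing key pressed at `index`: which print element to highlight."""
    return cells[index].ref


def cells_for(cells: List[Cell], ref: str) -> List[int]:
    """Teacher clicked element `ref`: which braille cells to underline."""
    return [i for i, c in enumerate(cells) if c.ref == ref]


# Arbitrary cells and made-up element ids, just to exercise the functions.
cells = [
    Cell("\u2801", "mi-1"),
    Cell("\u2816", "mo-1"),
    Cell("\u2816", "mo-1"),
    Cell("\u2803", "mi-2"),
]
```

Here `element_at(cells, 3)` answers the braille-to-print direction, `cells_for(cells, "mo-1")` answers the print-to-braille direction, and `clean(cells)` plays the role of the "cleaning" function.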

the faults of umcl:
- a bit slow, especially on windows, even if we use libraries natively compiled with MS compilers. it comes from XSLT. it is 4 times faster under linux and I can't really explain why.
- we lack developers so development is slow
- well there might be others. please Neil you used it so you can fire now ;-)

About an API: well, actually the idea of UMCL is to provide an API with a main module which abstracts the translators. if an application is compiled with the main module then any module can be added later on the computer and the library will find it.
it is very easy to make a module from an existing translator if this translator is written in portable C (especially WITHOUT using any microsoft specific library) or in XSLT. for other languages it needs a wrapper and some care about memory allocation, but there is no major problem

I hope for a release in mid-February on SourceForge

I attach to this message the current .h files (not released yet, but these will probably not change, except that I'm not sure - and lack time to check - whether the unicode output function is in the header file yet - it is in the library though, and works).
- umcl.h is to be used by application developers
- umcli.h is to be used by input module developers (from any format to MathML)
- umclo.h is to be used by output module developers (from MathML to any format)

oh, another specific feature is that our central format is what we call Canonical MathML. This is actually valid MathML, but each mathematical structure must be coded in a specific way. if an input module outputs canonical mathml directly, a lot of time is saved.

regards and best wishes to all
dom



--
Dominique Archambault
Université Pierre et Marie Curie - Paris
http://chezdom.net/blog
umcl.h
umcli.h
umclo.h

SusanJ

Jan 16, 2009, 10:27:55 AM
to free-math
Dom, congratulations on a very exciting work. I'm looking forward to
examining it closely.

Meanwhile, I noted that you mention that XSLT is slow. We made a
similar observation quite a few years back. However, I've recently
read that the newest XSLT transformers, which support XSLT 2.0 as
well, are supposed to be significantly faster. Perhaps someone else
has more specific information.

SusanJ

Jan 16, 2009, 1:54:02 PM
to free-math
Since we're brainstorming, I'd like to comment on the relationship
between Dom's "cross modal pointing possibilities" and my idea for
extended braille.

Quite a while ago, I had the idea of a single underlying electronic
representation for both braille and print which I call extended
braille. One use of extended braille, which is one-for-one with print
in the case of text, allows a sighted user to simply change the font
in an extended braille file to toggle between a print-like
representation (DotlessBraille) and simulated braille. Here are some
screen captures for a literary braille example plus an explanation of
extended braille which I posted over five years ago:
http://www.dotlessbraille.org/screencap.htm
(Although I didn't know it at the time, I think Aaron may have
implemented something similar in MegaDots even earlier.)

I pretty much gave up on this idea because I was never able to
interest any sighted US teachers or transcribers of braille. They
seem to feel VERY strongly about the importance of reading braille
dots visually.

Since I had originally been thinking only of static use, I was happy
to read of Dom's prototype for dynamic interaction such that when a
print reader selects a print symbol in their view of a document on a
computer screen, the corresponding symbol is "highlighted" on a
braille display and vice versa.

With math we have a somewhat different problem than with text because
print math is not necessarily linear nor in one-to-one correspondence
with the braille.

In his recent presentation that Dom linked to on his blog, he
advocates that, for collaborative efforts, each user should have the
display in a "natural representation." For example, with a fraction,
the print reader would have the numerator displayed above the
denominator while the braille reader has them displayed side-by-side
together with markup.
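
The contrast between the two views of a fraction can be sketched in Python with the standard `xml.etree.ElementTree` parser. This is a toy under assumptions: `linear` and `stacked` are invented helpers, the `FRAC(... OVER ...)` tokens are placeholders for whatever markup a braille code actually uses, and only `mfrac` gets special treatment.

```python
import xml.etree.ElementTree as ET


def linear(node):
    """Side-by-side form with explicit markup, as a braille reader gets it."""
    tag = node.tag.split("}")[-1]   # drop any namespace prefix
    kids = [linear(k) for k in node]
    if tag == "mfrac":
        return "FRAC(" + kids[0] + " OVER " + kids[1] + ")"
    if kids:
        return "".join(kids)
    return (node.text or "").strip()


def stacked(frac):
    """Numerator above denominator, as a print reader would see it."""
    num, den = (linear(k) for k in frac)
    width = max(len(num), len(den))
    return "\n".join([num.center(width), "-" * width, den.center(width)])


mathml = "<mfrac><mrow><mi>a</mi><mo>+</mo><mn>1</mn></mrow><mi>b</mi></mfrac>"
root = ET.fromstring(mathml)
braille_view = linear(root)
print_view = stacked(root)
```

Both views are derived from the same tree, which is what makes pointing from one to the other possible in principle.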

I'm not sure how Dom has accomplished this. The natural representation
approach is in contrast to the simpler approach taken by me and also
by the Lambda project, which presents the print math in the same linear
form as the braille math by using special "sighted-readable" glyphs to
represent the embedded markup that is part of the braille
representation.

I would say now that both representations have their place. In a
school setting where the print reader is a math teacher and the
braille reader is a student, it is important for the teacher to use
the same representation as the student. However, in collaborative work
with peers, the natural representation is appealing. I will point
out, however, that the sighted scientists I know who use LaTeX are
very quick at reading and writing linear LaTeX source and rarely use
WYSIWYG editors for math entry.

Neil Soiffer

Jan 16, 2009, 4:26:17 PM
to free...@googlegroups.com
Aaron asked me to post the full IDL file for the IMathMLToBraille interface used by UMCL, liblouis, and MathPlayer.  Here it is:

// IMathMLToBraille.idl
//

import "oaidl.idl";
import "ocidl.idl";

[
    object,
    uuid(32F66A2D-7614-11D4-BD11-00104BD3F987),
    dual,
    nonextensible,
    helpstring("IMathMLToBraille Interface"),
    pointer_default(unique)
]
interface IMathMLToBraille : IUnknown {
    // This is the "main" call. A MathML string is passed in along with whether it is an inline or display expression.
    //        mathml -- the MathML string to be translated.  It should use either Unicode or numeric entities.
    //            The MathML should consist of presentation MathML with a namespace declaration on the math tag.
    //        isInline -- true if the MathML is inline (instead of block/display)
    //        braille -- the translation based upon the values set by SetMathCode and SetBrailleWidth.
    //            Use SysAllocString to allocate the BSTR.
    //            The result should use the Unicode Braille characters (http://www.unicode.org/charts/PDF/U2800.pdf)
    // Returns either E_NOT_SUPPORTED if the braille code that was set is not supported or it returns a success code (S_OK).

    HRESULT MathMLToBraille([in] BSTR mathml, [in] VARIANT_BOOL isInline, [out] BSTR *braille);

    //    Get a "list" of supported braille math codes.
    //    See list below for suggested names -- any name is acceptable as long as it is accepted by SetMathCode.
    //
    //        mathCodes -- an enumeration interface allowing the caller to walk through the enumeration and get the string
    //                for each supported math code. See the MSDN documentation on IEnumString for information on its use.
    //                Note:  don't use SysFreeString to deallocate as these are not allocated using SysAllocString
    //                  This is not mentioned in the MSDN documentation; use CoTaskMemFree instead.
    HRESULT EnumMathCodes([out] IEnumString ** mathCodes);

    // Get basic information about the translator.
    //        name -- the name of the translator.  This should be a name that users recognize (see list below for suggested names).
    //        version -- the version of the translator (eg, "1.0.3" or "2.1c")
    //        description -- a one or two sentence description of the translator and its features
    // Any of these arguments may be 0 (Null).
    HRESULT GetTranslatorInfo([out] BSTR *name, [out] BSTR *version, [out] BSTR *description);

    // Get/Set the braille math code to use. If the value is not set, then GetMathCode returns E_NOTSET.
    HRESULT GetMathCode([out] BSTR *codeName);
    HRESULT SetMathCode([in] BSTR codeName);

    // Get/Set the number of braille cells available on a line.
    // A value of 0 means that the line has "infinite" width. This might be used if a user prefers to
    //   horizontally scroll their refreshable display. The default value is 0.
    HRESULT GetBrailleWidth([out] long *width);
    HRESULT SetBrailleWidth([in] long width);

    // Get/Set any options specific to a translator. The default value is an empty string.
    HRESULT GetMathOptions([out] BSTR *options);
    HRESULT SetMathOptions([in] BSTR options);
};
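
For readers who don't use COM, the call order the comments suggest (enumerate the codes, set one, set the width, then translate) can be mimicked in plain Python. This is a stand-in and not a COM client: `StubTranslator` and `translate` are hypothetical, the output is a row of blank braille cells instead of a real translation, and breaking lines with a newline at the set width is an assumption about how a translator might honour SetBrailleWidth.

```python
class StubTranslator:
    """Hypothetical stand-in with the same methods as IMathMLToBraille."""

    def __init__(self):
        self._code = None   # no math code set yet
        self._width = 0     # 0 means "infinite" line width

    def enum_math_codes(self):
        return ["Nemeth"]

    def set_math_code(self, name):
        if name not in self.enum_math_codes():
            raise ValueError(f"unsupported math code: {name}")
        self._code = name

    def set_braille_width(self, width):
        self._width = width

    def mathml_to_braille(self, mathml, is_inline):
        if self._code is None:
            raise RuntimeError("math code not set")
        # Placeholder output: one blank braille cell per input character.
        cells = "\u2800" * len(mathml)
        if self._width:
            rows = [cells[i:i + self._width]
                    for i in range(0, len(cells), self._width)]
            return "\n".join(rows)
        return cells


def translate(translator, mathml, wanted_code, width):
    """Return braille, or None if the translator lacks the wanted code."""
    if wanted_code not in translator.enum_math_codes():
        return None
    translator.set_math_code(wanted_code)
    translator.set_braille_width(width)
    return translator.mathml_to_braille(mathml, is_inline=True)


out = translate(StubTranslator(), "<math/>", "Nemeth", 4)
```

An AT could run the same sequence against every registered translator and keep the first one that supports the user's preferred code.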
