Restrictions on ToC nav contents?


Peter Hatch

Mar 11, 2013, 1:46:36 AM
to epu...@googlegroups.com
EPUB 3 has restrictions on the allowed content of the Table of
Contents nav element to make sure it is machine-readable; I'm
wondering what specific benefits are gained from that, and whether
they are significant enough to justify the complexity - it'd be nice
if e0 could simply allow any HTML there.

--
Peter Hatch

Bill McCoy

Mar 11, 2013, 9:53:50 AM
to Peter Hatch, epu...@googlegroups.com
Most EPUB reading systems display the TOC by extracting the data and presenting it in their own UX. In EPUB 2 this was the only option, as the NCX was not presentational markup. So it remains to be seen whether simply rendering the Navigation Document will become more popular (of course a TOC is usually part of the "spine" of the book, and since the Nav Doc is HTML it can now serve both purposes).

There's been preliminary discussion of defining some metadata so that publications can indicate a preference for whether and how the Nav Doc should be rendered vs. the data extracted; this could enable fancier custom navigation to be more easily added (a multi-level TOC a la help systems and Inkling titles, a film-strip-style TOC a la Apple iBooks Author content). But this is not yet being worked on.

A reliably machine-readable TOC is of course also very important for accessibility, but it's not clear to me whether e0 aspires to support accessibility use cases.

--Bill

Hadrien Gardeur

Mar 11, 2013, 10:09:16 AM
to Peter Hatch, epub-ng
A machine-readable TOC is absolutely necessary, for multiple reasons:
  • accessibility
  • we use it for our equivalent of a spine (a reading order)
  • it's widely used in the UI of popular reading systems
Now the real question is: which restrictions are required to make such an element easily machine-readable?

Peter Hatch

Mar 12, 2013, 1:22:49 AM
to Hadrien Gardeur, epub-ng
Can you be more specific about what is needed for accessibility?
Arbitrary HTML is accessible enough for the rest of the book - what
exactly does the Table of Contents need in addition to that? (Not
questioning that there is something, I just don't know what it is.)

For the spine, I think it would work to simply find every <a> element
with an href attribute, and use those, in order. I don't think any
restrictions are necessary.
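
Something like this rough sketch is all I have in mind (Python, purely
illustrative - the index.html file name and the class name are just
assumptions for the example):

```python
# Rough sketch of the "just collect the links" idea: walk index.html in
# document order and treat every <a href="..."> as the next item in the
# reading order. The file name is an assumption for the example.
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

with open("index.html", encoding="utf-8") as f:
    collector = LinkCollector()
    collector.feed(f.read())

reading_order = collector.hrefs  # linked resources, in document order
```

(Duplicate targets and fragment-only links would need a policy, but
that seems like an implementation detail rather than a reason for
markup restrictions.)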

I don't think we should complicate the spec in order to support
backwards-compatibility for popular reading systems, especially since
a non-standard rendering of the HTML makes testing harder.

--
Peter Hatch

Hadrien Gardeur

Mar 12, 2013, 12:37:25 PM
to Peter Hatch, epub-ng
I'd like to hear what Markus or Bill have to say about that, and why these particular restrictions were decided in the first place for EPUB3.

Daniel Weck

Mar 12, 2013, 12:53:57 PM
to Peter Hatch, Hadrien Gardeur, epub-ng
Please bear with me here: I am not suggesting any particular design route for e0's TOC. However, I thought it would be useful, in the context of this discussion, to recap the rationale for EPUB3's NavDoc. As you know, the EPUB3 Navigation Document serves a dual purpose, which is why it adheres to a well-defined HTML5 subset. Authoring guidelines / good practices alone would not be sufficient to guarantee interoperability: just like an index or a dictionary, a NavDoc implements a predictable "microdata" format suitable for machine processing, and in this case also geared towards populating an arbitrary reading system's user interface (trees / lists).

The NavDoc (or the NCX for that matter) enables users who depend on assistive technology (e.g. a screen reader) to consistently access a reliable navigation overlay for the publication they are reading. The fact that the HTML5 content model of a NavDoc is restrictive means that authors and / or production tools are less likely to break this basic, albeit fundamental, access mechanism.

In my opinion, the lowest functional requirement for a TOC, or any other useful type of landmark such as a List of Illustrations, is that each node in a tree-like data structure (i.e. nested ordered lists) carries an easily extractable textual label. A node item could be a mathematical expression, an image, a canvas painted via scripting, an interactive gizmo, etc., or even a combination of these, but the bottom line is that a simple text description of the hyperlink target should always be made available to a11y APIs (i.e. machine-extractable, when not directly displayed via the HTML/SVG/MathML/CSS rendering stack).
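
For reference, the constrained content model in EPUB3 looks roughly like the fragment below (illustrative only: the chapter file names are invented, and in a real Navigation Document the epub: prefix is bound to the http://www.idpf.org/2007/ops namespace on the root element):

```html
<!-- Illustrative fragment of an EPUB3-style toc nav -->
<nav epub:type="toc">
  <h1>Table of Contents</h1>
  <ol>
    <li><a href="chapter01.xhtml">Chapter 1</a>
      <ol>
        <li><a href="chapter01.xhtml#sec1">1.1 A Section</a></li>
      </ol>
    </li>
    <li><a href="chapter02.xhtml">Chapter 2</a></li>
  </ol>
</nav>
```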

Personally, I think it should be possible to programmatically verify that this contract is fulfilled, and automated validation goes a long way toward achieving this. I therefore feel that markup restrictions (on HTML5 structure and semantics) would make e0's TOC mechanism more robust, accessible and interoperable. Authoring guidelines are useful too, especially for more subjective issues (e.g. information overload in node items), so the usual checklists apply here (e.g. WAI).

Cheers, Daniel

Bill McCoy

Mar 12, 2013, 2:42:23 PM
to Hadrien Gardeur, Peter Hatch, epub-ng
Markus would have to speak to the architectural considerations as the
Nav Document was his creation, but perhaps Daniel's subsequent email
covered it.

Stepping back to the 50K-foot level: if I give you 100,000 XHTML documents
that contain a TOC, but nothing more - even if I tell you the element
that's the root of the TOC - there is no reliable way for you to
write a program that will present a consistent, accessible UX for that
TOC across all the documents. The motivation for specific restrictions
around the form and structure of a machine-readable TOC for the Nav
Document was to enable this to work.

Whether the particular restrictions decided on were optimal or not is
of course another question. But they clearly work. I saw yesterday
another brand-new cloud-based EPUB 3 reading system I'd never seen
before, deployed live by several publishers, with good TOC support
including integration with the JAWS screen reader. I don't personally
believe there's a good reason to consider an alternative to what's in
EPUB 3 already unless there's some clear argument that what's there
now is not "good enough" and that the pain of doing something else
would be worth the cost of fragmenting into two ways to do the same
thing. I haven't heard any such argument expressed for the Nav Document.

Of course if it's proposed to allow non-XML content... "tag soup"
HTML... i.e. to expect every piece of software that manipulates the
file to implement http://www.w3.org/html/wg/drafts/html/master/syntax.html#parsing
... then it's arguable that that's a bigger barrier to machine
processing in general as well as specifically for the TOC.

From the discussion I've heard on this list so far I have to say I'm
still more a fan of a modestly reformed e0 as a direction for
longer-term future of EPUB:
1. Remove requirement for MIMETYPE
2. Make spine structures optional / recast as XHTML microdata / align
with what W3C ultimately does for packaged apps
3. Allow HTML only in container-constrained content (e.g. iframes) -
there is no need to paginate this stuff and there's more gain and less
pain to enable off-the-shelf widgets to be used as-is, while still
having the page-level content be XHTML.

This would yield simple stuff that could still be backwards compatible
with existing EPUB 2/3 (EPUB 2 reading systems aren't going to try to
process widgets so enabling "tag soup" there could be OK).

Not quite a revolution but I haven't heard any compelling overall
benefits that would be gained from a deviant format. And assuming you
place some weight on a unified global open standard there's benefit to
*not* forking/fragmenting. Of course there may be some on this list
who don't support the goal of a unified standard - but I do as I've
seen the benefits that establishing even a just barely "good enough"
standard can deliver (PostScript, PDF, OpenType, JPEG, HTML/CSS,
JavaScript....).

And I still think the bigger fish to fry isn't trying to ex post facto
bikeshed EPUB 3 decisions for packaged publications, but to tackle the
unsolved problems in representing networked, distributed publications.
That's what gets my juices flowing about an "EPUB NG" but there so far
seems little appetite on this group to tackle it.

--Bill

Peter Hatch

Mar 13, 2013, 1:32:11 AM
to Bill McCoy, Hadrien Gardeur, epub-ng
On Tue, Mar 12, 2013 at 11:42 AM, Bill McCoy <whm...@gmail.com> wrote:
> Stepping back to the 50K-foot level: if I give you 100,000 XHTML documents
> that contain a TOC, but nothing more - even if I tell you the element
> that's the root of the TOC - there is no reliable way for you to
> write a program that will present a consistent, accessible UX for that
> TOC across all the documents. The motivation for specific restrictions
> around the form and structure of a machine-readable TOC for the Nav
> Document was to enable this to work.

I seem to be missing what exactly it is you want from the TOC that a
standard browser would not provide. It seems to me that a browser is
itself a consistent, accessible UX for any XHTML document, which is
why that's used for the vast majority of the ebook content - what
specific behavior is enabled by having restrictions for the TOC?

--
Peter Hatch

Daniel Weck

Mar 13, 2013, 7:39:00 AM
to Peter Hatch, Bill McCoy, Hadrien Gardeur, epub-ng
Hi Peter,
yes, modern web browsers are expected to properly expose structural and semantic information to assistive technologies, so from this general perspective there is indeed no intrinsic need to impose restrictions on markup syntax in order for content to be accessible. However, this low-level technical plumbing alone does not ensure content accessibility. In the case of a TOC and other lists of publication landmarks, users have certain cognitive expectations as to what the content model should be (a tree, lists, text labels - see my previous email).

The TOC/landmarks meta-information is a first-class citizen within a publication, and I don't think that marking an HTML5 fragment as such (e.g. using epub:type) is sufficient. By imposing a content model on authors / production tools, the navigation meta-information can be reliably and consistently processed, and exposed to end-users with a good degree of user-interface affordance. That is not to say that a TOC should always consist of a "boring" tree of text labels (see my previous email; there is scope for innovative TOCs), but in my view there is certainly a need for a lowest common denominator that can be checked structurally and semantically, in addition to being subjected to more subjective assertions (see my previous email about schema validation vs. good authoring practices / guidelines).

I hope this answers your question "what specific behavior is enabled by having restrictions for the TOC?".

PS: I am happy to be contradicted, and to consider a "free-for-all" TOC model where hyperlinks are automatically extracted in document order...but so far I have not seen convincing arguments.
;)

Cheers, Daniel

Baldur Bjarnason

Mar 13, 2013, 8:49:52 AM
to Peter Hatch, Bill McCoy, epub-ng

On 13 Mar 2013, at 05:32, Peter Hatch <peter...@gmail.com> wrote:
>
> I seem to be missing what exactly it is you want from the TOC that a
> standard browser would not provide. It seems to me that a browser is
> itself a consistent, accessible UX for any XHTML document, which is
> why that's used for the vast majority of the ebook content - what
> specific behavior is enabled by having restrictions for the TOC?

The basic structure of the EPUB3 navigation document has a few very useful accessibility features.

The list-embedded-in-a-nav pattern is one that is natively supported by most modern screen readers and it's a pattern described directly in the various HTML5 specs. By requiring the toc to be an OL in a NAV we are telling blind users that this is a primary navigation element containing links where the order of the links is meaningful (i.e. read them in this order). There isn't any other pattern that gets that meaning across to a blind or partially sighted reader as easily, i.e. we'd end up with something more complicated than an OL in a NAV to get across the same thing.

And once you add the optional header you have something that describes its meaning very clearly to everybody, machines, the sighted, and those without sight.

This isn't a hypothetical benefit, by the way, screen readers already support the nav element.

In fact, this is a major reason why I think that navigation documents with multiple navs (i.e. landmarks, list of images, different reading order, etc.) are problematic.

Supporting epub:type metadata natively isn't on the cards with any of the major screen readers as far as I can tell, especially not in the web browser context. To screen readers it would just be a document full of navigation lists, each one as important as the next, with little information as to each one's purpose, especially since the nav heading is optional. A navigation document that has a separate reading order from its TOC and is put on the web is going to be confusing to screen reader users.

I'm not against metadata described using meta elements or relationships described using link elements and rels. And adding machine-readable metadata using RDFA lite or microdata is also useful. All of those features can be ignored safely by screen readers (or even ignored by most simple reading systems and web browsers). It's the multiple identical navs thing that bothers me. It wouldn't be an issue if we could trust people to add headings to their lists, but relying on authors and devs to do work they don't have to do is usually a mistake.


On 12 Mar 2013, at 18:42, Bill McCoy <whm...@gmail.com> wrote:
>
> Of course if it's proposed to allow non-XML content... "tag soup"
> HTML... i.e. to expect every piece of software that manipulates the
> file to implement http://www.w3.org/html/wg/drafts/html/master/syntax.html#parsing
> ... then it's arguable that that's a bigger barrier to machine
> processing in general as well as specifically for the TOC.
>

Like I've been trying to tell you, HTML5 is not a barrier to machine processing *at all*. There is a host of streaming parsers for HTML in all major languages and most of the big XML libraries either parse HTML or interface with HTML parsers that emit library-native node trees. For example, libxml, the foundation of many major XML libraries, supports HTML parsing natively.

You forget that when HTML5 parsing was standardised it was hashed out by implementing HTML5 parsers in various languages; the parsing spec is practically a parser in pseudocode. Finding a major programming language without a capable HTML5 parser implementation is extremely difficult.

If you are unlucky enough not to find a parser that does what you need on its own, using one to pull valid HTML5 into an XML workflow is *trivial*. Machine processing HTML is a *huge* industry. It's a major part of many web apps and internal systems because many corporations have a *lot* of data in HTML. Many text-searching systems and libraries support indexing HTML natively (or have plugins that do) for this very reason. For most use cases, there is very little to separate valid HTML5 and valid XHTML in terms of speed, availability of tools, and ease of development. And for those few cases where XHTML is demonstrably superior, you just mandate that your group, department, or corporation use XHTML.
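
For instance, here is a minimal sketch of pulling tag-soup HTML into an XML/XPath workflow, assuming the Python lxml binding to libxml is installed (the markup is invented):

```python
# Minimal sketch: parse forgiving HTML with libxml's HTML parser (via lxml),
# then query and re-serialize it with ordinary XML tooling. Markup is invented.
from lxml import etree, html

soup = "<nav><ol><li><a href=ch1.html>Chapter 1</a><li><a href=ch2.html>Chapter 2</a></ol></nav>"

tree = html.fromstring(soup)                 # tolerant HTML parse
links = tree.xpath("//a/@href")              # e.g. ['ch1.html', 'ch2.html']
xhtml = etree.tostring(tree, method="xml")   # hand off to an XML pipeline
```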

Judging from the way you keep defending XHTML, you seem to be thinking that by supporting HTML we'd have to ban XHTML. That is very silly and completely untrue. Those whose lives would be made easier by XHTML use XHTML. Those whose lives would be made easier by using HTML use HTML.

HTML is *everywhere*. It's being used as a document format, archive format, app format, interchange format, you name it. Not supporting HTML is a *major* omission since it is a huge part of both modern IT and web development.

HTML5 has the added advantage over XHTML of being simpler, easier to create, better aligned with web development practices, and more widely supported in browsers. This benefits a lot of people. Not supporting it in e0 would be a major blow against the format.

Also, you implied earlier that XHTML might have a more capable security model than HTML. This is nonsense since they share the exact same security model. The strict parsing of XML is *not* a security feature and it'd be very foolish to rely on it for security. You are always going to have to escape untrusted content that you inject into your pages.

The same-origin system is flawed but it's a system used by both XHTML and HTML. There is a lot of work coming out of W3C and ECMA that provides a very interesting and very useful security model (i.e. Content Security Policy and Secure ECMAScript). And, again, these are security features that will be shared by HTML and XHTML.

> i.e. requiring support for *both* HTML "tag soup" and XHTML is "simpler" only for the hand-coding author and for the use case of direct website deployment, but is actually more complicated for every other use case

This is also not true. Supporting HTML, combined with a rendering model that is better aligned with web standards, would also make it drastically simpler to adapt web content into a packaged ebook. This is an issue a lot of people are facing at the moment. Having to parse or transform every page is a lot more hassle than just adding a sensible index.html and zipping everything up. This also happens to be one of the basic use cases described at the start of this discussion. You really shouldn't dismiss it so readily, since it's exactly the feature that motivates a lot of us participating in this discussion.

And for most use cases, supporting HTML is not more complicated, as I said above.

- best
- baldur

Dave Cramer

Mar 13, 2013, 9:31:56 AM
to Baldur Bjarnason, Peter Hatch, Bill McCoy, epub-ng
Peter, do you have a use case for dropping the EPUB3 requirements for nav, or is it a more general concern about complexity?

I think of an ordered list as being the ideal markup for a table of contents. The function of a toc depends on links, and depends on labels for those links. So for simple cases, you're already looking at something much like what EPUB3 uses.

e0 is trying to make very few demands on content authors, but this feels like the one place where the benefits of a prescribed approach are worth it, for the reasons that many other commentators have given.

Dave

Hadrien Gardeur

Mar 13, 2013, 9:49:14 AM
to Bill McCoy, Peter Hatch, epub-ng
> From the discussion I've heard on this list so far I have to say I'm
> still more a fan of a modestly reformed e0 as a direction for
> longer-term future of EPUB:
> 1. Remove requirement for MIMETYPE
> 2. Make spine structures optional / recast as XHTML microdata / align
> with what W3C ultimately does for packaged apps
> 3. Allow HTML only in container-constrained content (e.g. iframes) -
> there is no need to paginate this stuff and there's more gain and less
> pain to enable off-the-shelf widgets to be used as-is, while still
> having the page-level content be XHTML.
>
> This would yield simple stuff that could still be backwards compatible
> with existing EPUB 2/3 (EPUB 2 reading systems aren't going to try to
> process widgets so enabling "tag soup" there could be OK).

At this point, EPUB NG/Zero would not be compatible with EPUB2 or EPUB3, but you could easily generate a compatible EPUB2/3 file (by generating the OPF based on what's included in the ZIP and in index.html) or create files that are both valid EPUB NG and EPUB 3.
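
For example, the manifest half of that conversion could be sketched like this (Python; the ZIP file name and item ids are invented, and the package metadata and spine are left out):

```python
# Rough sketch: build an OPF-style <manifest> from whatever is in the ZIP.
# File name and ids are invented; dc: metadata and the <spine> are omitted.
import mimetypes
import zipfile
from xml.sax.saxutils import quoteattr

with zipfile.ZipFile("book.e0.zip") as zf:
    entries = [n for n in zf.namelist() if not n.endswith("/")]

items = []
for i, name in enumerate(entries):
    media_type = mimetypes.guess_type(name)[0] or "application/octet-stream"
    items.append('  <item id="item%d" href=%s media-type="%s"/>' %
                 (i, quoteattr(name), media_type))

manifest = "<manifest>\n%s\n</manifest>" % "\n".join(items)
print(manifest)  # would be embedded in a full OPF <package> document
```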
 

> Not quite a revolution but I haven't heard any compelling overall
> benefits that would be gained from a deviant format. And assuming you
> place some weight on a unified global open standard there's benefit to
> *not* forking/fragmenting. Of course there may be some on this list
> who don't support the goal of a unified standard - but I do as I've
> seen the benefits that establishing even a just barely "good enough"
> standard can deliver (PostScript, PDF, OpenType, JPEG, HTML/CSS,
> JavaScript....).

So far, the goal has been to make something less complex than EPUB3 and more in line with pure Web standards, and I believe that has been the case for the decisions we've agreed on.
Certainly not a revolution, but I don't think that a "revolution" was the goal of this project.
 

> And I still think the bigger fish to fry isn't trying to ex post facto
> bikeshed EPUB 3 decisions for packaged publications, but to tackle the
> unsolved problems in representing networked, distributed publications.
> That's what gets my juices flowing about an "EPUB NG" but there so far
> seems little appetite on this group to tackle it.

I'll talk about this in another thread (we still need to talk about a few things regarding the <nav> element in index.html), but I'm a little puzzled by the fact that you're very interested in unpackaged publications (with resources spread across the Web), yet hostile towards HTML. We certainly can't expect a distributed publication to be XHTML only. 

Hadrien Gardeur

Mar 13, 2013, 9:52:20 AM
to Baldur Bjarnason, Peter Hatch, Bill McCoy, epub-ng
> It's the multiple identical navs thing that bothers me. It wouldn't be an issue if we could trust people to add headings to their lists, but relying on authors and devs to do work they don't have to do is usually a mistake.

They won't be identical. Reading order would be entirely optional; if you don't need a different reading order, just use a single nav.

Hadrien Gardeur

Mar 13, 2013, 9:57:07 AM
to Dave Cramer, Baldur Bjarnason, Peter Hatch, Bill McCoy, epub-ng
> Peter, do you have a use case for dropping the EPUB3 requirements for nav, or is it a more general concern about complexity?
>
> I think of an ordered list as being the ideal markup for a table of contents. The function of a toc depends on links, and depends on labels for those links. So for simple cases, you're already looking at something much like what EPUB3 uses.
>
> e0 is trying to make very few demands on content authors, but this feels like the one place where the benefits of a prescribed approach are worth it, for the reasons that many other commentators have given.

Agreed. For index.html we want to make sure that we're as machine-readable as we can be (including for screen readers). Bill mentioned that some people won't be happy about using HTML for such documents instead of JSON: I guess that this is also what he meant by that. People will expect index.html to be something as close as possible to a simple machine-readable document (like JSON), yet we want to make sure that we also have something we can display in a browser (HTML+CSS).
I still hope that Markus will see this thread, though, and explain carefully why each restriction was decided on for EPUB3.

Bill McCoy

Mar 13, 2013, 11:01:02 AM
to Baldur Bjarnason, Peter Hatch, epub-ng
Hi Baldur,

No, I wasn't thinking that XHTML would be banned. But your statement
"Those whose lives would be made easier by XHTML use XHTML. Those
whose lives would be made easier by using HTML use HTML" ignores that
software that needs to process elements of an HTML+XHTML-supporting e0
would need to support parsing *two* syntaxes whereas presently it only
has to support *one* (and that one happens to be directly validatable,
has support built into every dev environment, and is also needed for
SVG and MathML). Pretty much only some hand-coding authors' lives
would (potentially) be made easier by HTML; everyone else's lives
would be made harder.

Re: HTML parsing. OK, maybe parsing of HTML is not quite as
challenging as I imagined, but nor is it nearly as built into dev
environments as XML. Nor is it as robustly well-specified. You cited
libxml, for example: the module there that supports HTML is stated as
supporting HTML 4.0, which implies not only that it may not handle
HTML5 content but also that it was done before the "tag soup" parsing
rules were formalized during HTML5 spec development - the
documentation says that "It should be able to parse 'real world'
HTML, even if severely broken" but that is a pretty vague statement.
libxml is also a pretty heavyweight component that's mainly
DOM-oriented; its stream mode is considered pretty primitive. The main
stream-oriented XML parser, SAX, can't handle HTML.

Anyway I stand by the point that XML parsing is a more trivial task
than HTML+XML parsing.

But this doesn't necessarily mean HTML shouldn't be supported in a
future e0, only that there are costs to consider as well as benefits,
and that the costs include complexity of processing as well as
breaking backwards compatibility. Again IMO I see the most benefits
and least cost in enabling HTML in container-constrained content
(iframes and so on) as widgets aren't primary document content, are
where hand-coding is most common, wouldn't generally need to be
manipulated by intermediate software or reading systems, and could
even enable content that still works on existing EPUB 2 reading
systems that ignore iframes. And I think the widget ecosystem is
clearly more advanced in the HTML world so we get the most gain from
enabling HTML here, less so for document content. Just an idea... and
admittedly yet another compromise proposal... ;-)

--Bill