Blog post on navigation and reading order

Dave Cramer

Feb 27, 2013, 10:36:30 PM

to epub-ng

I wrote up sample code for a few possibilities:

http://epubzero.blogspot.com/2013/02/spineless-part-two.html

Dave

Kjartan Müller

Feb 28, 2013, 3:01:25 AM

to epu...@googlegroups.com

I would prefer something like this:

<nav itemscope itemtype=".../toc">
  <ol>
    <li><a itemprop="spine" href="file1.html">Chapter 1</a></li>
    <li><a href="file1-quiz.html">Chapter 1 Review</a></li>
    <li><a itemprop="spine" href="file2.html">Chapter 2</a></li>
    <li><a href="file2-quiz.html">Chapter 2 Review</a></li>
    <li><a itemprop="spine" href="file3.html">Chapter 3</a></li>
    <li><a href="file3-quiz.html">Chapter 3 Review</a></li>
  </ol>
</nav>
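
For illustration only, here is a minimal sketch of how a reading system might consume markup like the above, using nothing but Python's standard `html.parser`. The names `NavLinkParser` and `read_nav` are made up for this example, and treating `itemprop="spine"` as the reading-order marker is just the proposal under discussion, not anything specified.

```python
from html.parser import HTMLParser


class NavLinkParser(HTMLParser):
    """Collect (href, in_spine) for every <a> found inside a <nav>."""

    def __init__(self):
        super().__init__()
        self.nav_depth = 0   # > 0 while we are inside a <nav>
        self.links = []      # (href, in_spine) in document order

    def handle_starttag(self, tag, attrs):
        if tag == "nav":
            self.nav_depth += 1
        elif tag == "a" and self.nav_depth > 0:
            attr = dict(attrs)
            href = attr.get("href")
            if href is not None:
                # itemprop may hold several space-separated tokens
                props = (attr.get("itemprop") or "").split()
                self.links.append((href, "spine" in props))

    def handle_endtag(self, tag):
        if tag == "nav" and self.nav_depth > 0:
            self.nav_depth -= 1


def read_nav(html):
    """Return (toc, spine): every nav link, and only those marked as spine."""
    parser = NavLinkParser()
    parser.feed(html)
    parser.close()
    toc = [href for href, _ in parser.links]
    spine = [href for href, in_spine in parser.links if in_spine]
    return toc, spine
```

With the six-item list above, `toc` would hold all six files and `spine` only the three chapter files, so the quizzes stay reachable from the ToC but out of the linear order.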

Kjartan

Dave Cramer

Feb 28, 2013, 12:37:32 PM

to Kjartan Müller, epub-ng

Should the default (with no itemprop) be the case where a document is outside the linear reading order of the book? To me, this seems to make simple cases more difficult. I'd rather mark things that aren't in the reading order, just as itemref in EPUB defaults to linear="yes".

Dave

Baldur Bjarnason

Feb 28, 2013, 1:59:46 PM

to epu...@googlegroups.com

First reaction: I'm not that much of a fan of Perl's motto that easy things should be easy and hard things should be possible. It's a motto of a programming language that is an utter pain to work with and so immediately suspect :-D

In my experience hard things break as soon as users and devs get their hands on them and HTML5 is much too complicated to begin with. Adding anything to it is risky and likely to break in the hands of users. What we want is the other variation of the motto: make the hard things easy and the easy elegant.

To that end I'd like to suggest that we just say that defining a reading order that diverges from the TOC is simply out of scope for e0.

And that those who require that behaviour define among themselves a way of representing it in a file that can be linked in the index.html head with a <link rel="spine" href="" type=""> where the type indicates the format used to describe the behaviour (my suggestion would be JSON).

In fact, I'd suggest that a lot of the features people demand should be community-led extensions that describe either specific schemas for microdata/RDFa Lite or a JSON format.

IMNSHO, the base format should, on its own, be laughably ridiculously brain-dead simple to create and use and have no major features that aren't a part of what browsers support anyway. Like Daniel said in his requirements email, making an ebook shouldn't be more complicated than zipping up a website.

- best
- baldur

Dave Cramer

Feb 28, 2013, 2:30:49 PM

to Baldur Bjarnason, epub-ng

On Thu, Feb 28, 2013 at 1:59 PM, Baldur Bjarnason <baldur.b...@gmail.com> wrote:

> To that end I'd like to suggest that we just say that defining a reading order that diverges from the TOC is simply out of scope for e0.

I've been pushing on this issue a bit because an implementation is in the works (not from me!) which deals with complex educational materials, where such divergence is deemed important.

Another approach would be to say that this is a restriction on the toc—if you don't want a doc in the primary reading order, just don't link to it in the nav in index.html.

> And that those who require that behaviour define among themselves a way of representing it in a file that can be linked in the index.html head with a <link rel="spine" href="" type=""> where the type indicates the format used to describe the behaviour (my suggestion would be JSON).
>
> In fact, I'd suggest that a lot of the features people demand should be community-led extensions that describe either specific schemas for microdata/RDFa Lite or a JSON format.
>
> IMNSHO, the base format should, on its own, be laughably ridiculously brain-dead simple to create and use and have no major features that aren't a part of what browsers support anyway. Like Daniel said in his requirements email, making an ebook shouldn't be more complicated than zipping up a website.

Do you think index.html with required nav meets that test, or comes close enough for our purposes? The only simpler idea I can think of is to have no "glue" file at all, and just have the reading system open the docs in the zip in order. But then the order would be language-dependent, and we'd argue just as much... :)

Dave

Alberto Pettarin

Feb 28, 2013, 2:34:27 PM

to epu...@googlegroups.com

On 02/28/2013 07:59 PM, Baldur Bjarnason wrote:
>
>
> On Thursday, February 28, 2013 3:36:30 AM UTC, Dave Cramer wrote:
>
> I wrote up sample code for a few possibilities:
>
> http://epubzero.blogspot.com/2013/02/spineless-part-two.html
>
> Dave
>
>
> First reaction: I'm not that much of a fan of Perl's motto that easy
> things should be easy and hard things should be possible. It's a motto
> of a programming language that is an utter pain to work with and so
> immediately suspect :-D
>
> In my experience hard things break as soon as users and devs get their
> hands on them and HTML5 is much too complicated to begin with. Adding
> anything to it is risky and likely to break in the hands of users. What
> we want is the other variation of the motto: make the hard things easy
> and the easy elegant.
>
> To that end I'd like to suggest that we just say that defining a reading
> order that diverges from the TOC is simply out of scope for e0.

I like the latter approach, but --- if I understand your proposal
correctly --- I have an objection.

Let me observe that "reading order" (I propose to call it "default
resource order", since it might be a list of MP3 files) and "ToC" are:
a) from an abstract point of view, two sufficiently simple, distinct
concepts, and they are implied by a definition of "book" that covers a
large spectrum of different "contents" (fiction, manuals, school books,
audioebooks, etc.); and
b) from a practical point of view, very convenient to have in e0, since
they are very likely to be used by producers and, more importantly, to
be required by (human) readers.

Hence, I second the proposal of having the opportunity in e0 to
explicitly declare them (in an easy way, both to write and to parse),
like in Dave's first proposal, which, by the way, lets us define a
ToC/"default resource order" not limited to HTML pages.

> And that those who require that behaviour define among themselves a way
> of representing it in a file that can be linked in the index.html head
> with a <link rel="spine" href="" type=""> where the type indicates the
> format used to describe the behaviour (my suggestion would be JSON).

I think that this mechanism might work well for LoTs, LoIs,
bibliographies, etc.

> In fact, I'd suggest that a lot of the features people demand should be
> community-led extensions that describe either specific schemas for
> microdata/RDFa Lite or a JSON format.
>
> IMNSHO, the base format should, on its own, be laughably ridiculously
> brain-dead simple to create and use and have no major features that
> aren't a part of what browsers support anyway. Like Daniel said in his
> requirements email, making an ebook shouldn't be more complicated than
> zipping up a website.

I agree.

AlPe

Alberto Pettarin

Feb 28, 2013, 2:43:15 PM

to epu...@googlegroups.com

On 02/28/2013 08:30 PM, Dave Cramer wrote:
> Do you think index.html with required nav meets that test, or comes
> close enough for our purposes?

In my opinion, this is a good trade-off between simplicity and flexibility.

> The only simpler idea I can think of is
> to have no "glue" file at all, and just have the reading system open the
> docs in the zip in order. But then the order would be
> language-dependent, and we'd argue just as much... :)

This approach looks very fragile, and not every zip library out there
has a function for listing the zip content in order (yeah, lazy coders...).
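
To illustrate the fragility with one concrete library: Python's standard `zipfile` reports entries in the order they were written to the archive, not in any sorted order, so a reading system would have to choose and apply its own sort. The sketch below is only a demonstration; the helper name `archive_and_sorted_order` and the file names are invented for the example.

```python
import io
import zipfile


def archive_and_sorted_order(names):
    """Write empty entries in the given order; return (stored order, sorted order)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, "")
    with zipfile.ZipFile(buf) as zf:
        stored = zf.namelist()          # order of entries in the archive
    return stored, sorted(stored)       # plain code-point sort


stored, by_name = archive_and_sorted_order(["ch10.html", "ch2.html", "ch1.html"])
print(stored)    # archive order, as written
print(by_name)   # code-point order puts ch10.html before ch2.html
```

Neither list is the intended reading order here, and a "natural" or locale-aware sort would be yet another choice to argue about.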

(Apologies for my previous email, which I wrote nearly concurrently with
Dave's.)

AlPe

Baldur Bjarnason

Feb 28, 2013, 3:25:49 PM

to epub-ng

On 28 Feb 2013, at 19:30, Dave Cramer <dau...@gmail.com> wrote:

>
> I've been pushing on this issue a bit because an implementation is in the works (not from me!) which deals with complex educational materials, where such divergence is deemed important.
>

Cool.

>
> Do you think index.html with required nav meets that test, or comes close enough for our purposes? The only simpler idea I can think of is to have no "glue" file at all, and just have the reading system open the docs in the zip in order. But then the order would be language-dependent, and we'd argue just as much... :)
>

An index.html file with a required nav does meet that test. I just think that defining a reading order that deviates from that defined in that nav outline (no matter whether it does so by defining a second outline or by hiding elements in the base outline) can get very complicated and should be out of scope for the base format.

Implementers should feel free to try and get community consensus for extensions that work with dumb implementations of the base format but that should be independent from the base format. I think a strict rule about a narrow scope is the only way we can keep the base format simple. Otherwise things will get very messy, very quickly.

On 28 Feb 2013, at 19:34, Alberto Pettarin <pett...@gmail.com> wrote:
> I like the latter approach, but --- if I understand your proposal correctly --- I have an objection.
>
> Let me observe that "reading order" (I propose to call it "default resource order", since it might be a list of MP3 files) and "ToC" are:
> a) from an abstract point of view, two sufficiently simple, distinct concepts, and they are implied by a definition of "book" that covers a large spectrum of different "contents" (fiction, manuals, school books, audioebooks, etc.); and
> b) from a practical point of view, very convenient to have in e0, since they are very likely to be used by producers and, more importantly, to be required by (human) readers.
>
> Hence, I second the proposal of having the opportunity in e0 to explicitly declare them (in an easy way, both to write and to parse), like in Dave's first proposal, which, by the way, lets us define a ToC/"default resource order" not limited to HTML pages.

Well, if it has to be done, then I prefer having a separate outline where the root nav element is marked in some way as being the spine or playlist (as opposed to marking individual elements in the outline). Whether that's done using a data- attribute, epub- attribute, microdata, or RDFa Lite doesn't matter to me, largely because it lets most users ignore the feature completely. I only ask that we pick one mechanism for e0 metadata and stick to using that everywhere necessary.

- best
- baldur

Hadrien Gardeur

Feb 28, 2013, 3:31:47 PM

to Baldur Bjarnason, epub-ng

> An index.html file with a required nav does meet that test. I just think that defining a reading order that deviates from that defined in that nav outline (no matter whether it does so by defining a second outline or by hiding elements in the base outline) can get very complicated and should be out of scope for the base format.
>
> Implementers should feel free to try and get community consensus for extensions that work with dumb implementations of the base format but that should be independent from the base format. I think a strict rule about a narrow scope is the only way we can keep the base format simple. Otherwise things will get very messy, very quickly.

Keep in mind that the alternate navigation would be optional.

If your reading order is the same as your navigation, then use a single OL list in index.html and you're fine.

> Well, if it has to be done, then I prefer having a separate outline where the root nav element is marked in some way as being the spine or playlist (as opposed to marking individual elements in the outline). Whether that's done using a data- attribute, epub- attribute, microdata, or RDFa Lite doesn't matter to me, largely because it lets most users ignore the feature completely. I only ask that we pick one mechanism for e0 metadata and stick to using that everywhere necessary.

I strongly believe that we need it as an option. We're not just talking about complex textbooks; many basic fiction books also need a reading order that is slightly different from their ToC (Dave provided a good example on the EPUB Zero blog).

I believe that two separate lists in such cases, with a simple type attribute to identify them, is the better option (hiding elements sounds horrible and reminds me of the current linear="no" mess).
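
As a rough sketch of the two-list idea: the code below assumes a hypothetical `data-type` attribute on `<nav>` with the values `toc` and `spine` (the thread has not settled on any attribute name or values), and falls back to the ToC when no separate spine list exists. It uses only Python's standard `html.parser`, does not handle nested `<nav>` elements, and `TypedNavParser` and `reading_order` are invented names.

```python
from html.parser import HTMLParser


class TypedNavParser(HTMLParser):
    """Group <a href> values by the data-type of the enclosing <nav>."""

    def __init__(self):
        super().__init__()
        self.current = None   # data-type of the <nav> we are inside, if any
        self.lists = {}       # e.g. {"toc": [...], "spine": [...]}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "nav":
            # Assumption: a <nav> without data-type is the table of contents.
            self.current = attr.get("data-type") or "toc"
            self.lists.setdefault(self.current, [])
        elif tag == "a" and self.current is not None and attr.get("href"):
            self.lists[self.current].append(attr["href"])

    def handle_endtag(self, tag):
        if tag == "nav":
            self.current = None


def reading_order(index_html):
    """Use the spine list when present, otherwise the ToC order."""
    parser = TypedNavParser()
    parser.feed(index_html)
    parser.close()
    lists = parser.lists
    return lists.get("spine", lists.get("toc", []))
```

A book whose reading order matches its navigation needs only the single list; the second list is read only when an author supplies it.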

Hadrien

Baldur Bjarnason

Feb 28, 2013, 5:09:13 PM

to epub-ng

On 28 Feb 2013, at 20:31, Hadrien Gardeur <hadrien...@feedbooks.com> wrote:
> Keep in mind that the alternate navigation would be optional.
> If your reading order is the same as your navigation, then use a single OL list in index.html and you're fine.
>

> I strongly believe that we need it as an option. We're not just talking about complex textbooks, many basic fiction books also need to have a reading order that is slightly different from their ToC (Dave provided a good example on the EPUB Zero blog).
>
> I believe that two separate lists in such cases, with a simple type attribute to identify them, is the better option (hiding elements sounds horrible and reminds me of the current linear="no" mess).
>
> Hadrien

Okay. The consensus is clearly that we need this. In that case I agree with you that we should have two separate lists with a simple type attribute to identify them, and that we should avoid the hiding-elements approach.

So, +1 on that :-)

- best
- baldur

Florian Rivoal

Mar 1, 2013, 6:36:27 AM

to epub-ng

On Thu, 28 Feb 2013 23:09:13 +0100, Baldur Bjarnason
<baldur.b...@gmail.com> wrote:

> Okay. The consensus is clearly that we need this. In that case I agree
> with you that we should have two separate lists with a simple type
> attribute to identify them, and that we should avoid the hiding-elements
> approach.
>
> So, +1 on that :-)

Sorry for jumping in late, I just joined the list.

I'd like to argue in favor of link@rel=next a little.

Browsers as they are today[1] can use this information to navigate to the
next chapter once the end of the current one is reached. If we use a <ol>
inside a <nav> in index.html, it means that reading the book with a
browser will be somewhat disruptive of the reading experience, requiring
the user to navigate back to the index and then to the following chapter.
Of course, extensions could be built to deal with it, but they don't exist
as of now.

Besides, they would require more state management. If the next chapter is
indicated by link@rel=next, you can find it from the page you're on, but
if it comes from a structure in index.html, you have to keep that in
memory, or reload index.html. Reloading is obviously bad for performance,
especially if the book is being read over http rather than from the zip
file. But keeping it in memory could be in conflict with HTTP's caching
and expiry headers. Also HTTP is essentially a stateless protocol, and the
web has been built on this assumption. I am not very comfortable
introducing things that make the behavior in one page (pressing next at
the end of a chapter) depend on something that was in another page
(index.html). For example, imagine a piece of javascript in index.html
that modifies the nav/ol periodically (maybe this ebook is a self updating
newspaper). When you are reading chapter 3 and press next, what happens?
Do you go to what was after chapter 3 in index.html last time you loaded
it? Do you keep index.html's javascript environment running in the
background, so that javascript can do its job, and follow whatever is the
current link? Keeping navigation in a separate file makes this kind of
mess possible, and requires us to answer these questions. I'd rather not
go there if we can avoid it.
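
To make the "you can find it from the page you're on" point concrete, here is a rough sketch that looks only at the current page and returns its `rel=next` target. It uses Python's standard `html.parser`; `NextLinkParser` and `next_page` are names invented for the example, and taking the first matching `<link>` is an assumption.

```python
from html.parser import HTMLParser


class NextLinkParser(HTMLParser):
    """Remember the href of the first <link> whose rel contains "next"."""

    def __init__(self):
        super().__init__()
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.next_href is not None:
            return
        attr = dict(attrs)
        # rel is a space-separated, case-insensitive token list
        rels = (attr.get("rel") or "").lower().split()
        if "next" in rels:
            self.next_href = attr.get("href")


def next_page(page_html):
    """Return the rel=next target of this page, or None if it has none."""
    parser = NextLinkParser()
    parser.feed(page_html)
    parser.close()
    return parser.next_href
```

Nothing outside the current document is consulted, which is the whole appeal: no index.html state has to be kept or refreshed.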

An ol in a nav in index.html does sound appropriate for the non reading
order ToC. No magic behavior is necessary for this one: a dedicated ebook
reader may want to present it in a fancy way, but just displaying the html
when the user invokes the index is good enough.

Other downsides were brought up for link@rel=next. I think they are real,
but not show stoppers. Let me address the ones I've noticed.

* link@rel=next doesn't work with non HTML chapters. This is true, but it
can easily be worked around. Most of the content we would want to work
with (from videos to vector graphics to plain images) can be embedded in
HTML, making the point moot. For the rest, this is a problem for the web
as much as for ebooks, and should be addressed by working with the folks
specifying html.

* Centralizing the information in index.html makes modifications a lot
easier, and allows sharing chapters across books. For purely manual
editing, this is true. However, a trivial template and build system can
take care of this.

There are also (currently unimplemented) proposals[2] to be able to style
in css the transition (slide, flip page, fade...) from one page to the
other when following link@rel=next and friends. I don't expect this
particular proposal to be particularly successful as is, but if we go some
other route, we will miss such improvements to the web platform when they
materialize.

All in all, I don't think that link@rel=next is perfect, but it might just
be good enough for us not to need to invent some other structure or
behavior distinct from the web.

- Florian

[1] Following link@rel=next to jump to the next page if you hit space
after scrolling down to the bottom of the current one is built into Opera.
Firefox and Chrome have a bunch of addons for the same thing, like this
one https://addons.mozilla.org/en-US/firefox/addon/space-next/ or this one
https://chrome.google.com/webstore/detail/next-page/ioogigbgjmadikfocfdmmdlghogaehca

[2] http://www.w3.org/TR/css3-gcpm/#page-shift-effects

Kjartan Müller

Mar 1, 2013, 6:36:59 AM

to epu...@googlegroups.com, Kjartan Müller

Some consensus seems to be emerging on having two lists, so I'm not sure I really want to press this, but here goes.

My reason for suggesting microdata is that it lets us have two lists on different levels of abstraction. One list in HTML for rendering and user navigation, and one list defined by metadata to be handled by the RS. So it is really not the case of one list with opt-in or opt-out for reading order, but two layered more or less on top of each other. That would mean that

<nav itemscope itemtype=".../toc">
  <ol>
    <li><a itemprop="section" href="file1.html">Chapter 1</a></li>
    <li><a itemprop="section" href="file2.html">Chapter 2</a></li>
    <li><a itemprop="section" href="file3.html">Chapter 3</a></li>
  </ol>
</nav>

could be considered roughly equivalent to

<nav>
  <ol>
    <li><a href="file1.html">Chapter 1</a></li>
    <li><a href="file2.html">Chapter 2</a></li>
    <li><a href="file2.html#quiz">Chapter 2 - quiz</a></li>
    <li><a href="file3.html">Chapter 3</a></li>
    <li><a href="...">Final assessment</a></li>
  </ol>
</nav>

and in a single-file EPUB something like:

<nav>
  <ol>
    <li><a href="#chapter1">Chapter 1</a></li>
    <li><a href="#chapter2">Chapter 2</a></li>
    <li><a href="#quiz">Chapter 2 - quiz</a></li>
    <li><a href="#chapter3">Chapter 3</a></li>
    <li><a href="...">Final assessment</a></li>
  </ol>
</nav>
</body>

I think this could still meet Daniel's basic requirements, so it might be worth considering. But I'm not sure myself.

Kjartan

Hadrien Gardeur

Mar 1, 2013, 7:00:14 AM

to Florian Rivoal, epub-ng

> I'd like to argue in favor of link@rel=next a little.
>
> Browsers as they are today[1] can use this information to navigate to the next chapter once the end of the current one is reached. If we use a <ol> inside a <nav> in index.html, it means that reading the book with a browser will be somewhat disruptive of the reading experience, requiring the user to navigate back to the index and then to the following chapter. Of course, extensions could be built to deal with it, but they don't exist as of now.
>
> Besides, they would require more state management. If the next chapter is indicated by link@rel=next, you can find it from the page you're on, but if it comes from a structure in index.html, you have to keep that in memory, or reload index.html. Reloading is obviously bad for performance, especially if the book is being read over http rather than from the zip file. But keeping it in memory could be in conflict with HTTP's caching and expiry headers. Also HTTP is essentially a stateless protocol, and the web has been built on this assumption. I am not very comfortable introducing things that make the behavior in one page (pressing next at the end of a chapter) depend on something that was in another page (index.html). For example, imagine a piece of javascript in index.html that modifies the nav/ol periodically (maybe this ebook is a self updating newspaper). When you are reading chapter 3 and press next, what happens? Do you go to what was after chapter 3 in index.html last time you loaded it? Do you keep index.html's javascript environment running in the background, so that javascript can do its job, and follow whatever is the current link? Keeping navigation in a separate file makes this kind of mess possible, and requires us to answer these questions. I'd rather not go there if we can avoid it.
>
> An ol in a nav in index.html does sound appropriate for the non reading order ToC. No magic behavior is necessary for this one: a dedicated ebook reader may want to present it in a fancy way, but just displaying the html when the user invokes the index is good enough.

I see where you're coming from (REST and in particular HATEOAS) but it doesn't really apply here, especially since you admit yourself that a centralized ToC is necessary anyway.

The difference between a publication and a set of Web resources is essentially the "glue": how we link these documents together (reading order and ToC) and what they represent when we group them together (publication's metadata). To be truly stateless, not only would you have to rely on reading order discovery (link@rel="next"), you'd also have to repeat the ToC and the publication's metadata too.

What you're advocating for is a mixed model where some information is stored between states (publication metadata, ToC) and other information is discovered whenever we fetch a resource (reading order).

It doesn't make much sense.

> Other downsides were brought up for link@rel=next. I think they are real, but not show stoppers. Let me address the ones I've noticed.
>
> * link@rel=next doesn't work with non HTML chapters. This is true, but it can easily be worked around. Most of the content we would want to work with (from videos to vector graphics to plain images) can be embedded in HTML, making the point moot. For the rest, this is a problem for the web as much as for ebooks, and should be addressed by working with the folks specifying html.

Forcing content creators to embed everything in HTML seems like an even worse restriction to me than what we currently have in EPUB3 (XHTML or SVG).

For a 100-page comic, you'd need 100 HTML documents that contain little information at all, just for the sake of supporting something that can't be stateless anyway. If we were reading strictly on the Web (and not in a zip container), one could argue that the HTTP Link header is a good fit for that, but that's not our use case.

Hadrien

Alberto Pettarin

Mar 1, 2013, 7:36:47 AM

to epu...@googlegroups.com

On 03/01/2013 12:36 PM, Florian Rivoal wrote:
> Browsers as they are today[1] can use this information to navigate to
> the next chapter once the end of the current one is reached. If we use a
> <ol> inside a <nav> in index.html, it means that reading the book with a
> browser will be somewhat disruptive of the reading experience, requiring
> the user to navigate back to the index and then to the following
> chapter. Of course, extensions could be built to deal with it, but they
> don't exist as of now.

I guess that just the requirement of having a ZIP container will imply
the need for a browser extension. However, I am not a browser expert, and I
might be wrong on this, so feel free to correct me.

> * Centralizing the information in index.html makes modifications a lot
> easier, and allows sharing chapters across books. For purely manual
> editing, this is true. However, a trivial template and build system can
> take care of this.

Sure, but the first option allows authors to use zip and any text editor
they like, plus it is trivial to automate the assembly of "large"
documents. The second option (using link@rel and a "specialized" tool)
still slightly limits the authoring choices. (I admit, this is a minor
point.)

> There are also (currently unimplemented) proposals[2] to be able to
> style in css the transition (slide, flip page, fade...) from one page to
> the other when following link@rel=next and friends. I don't expect this
> particular proposal to be particularly successful as is, but if we go
> some other route, we will miss such improvements to the web platform
> when they materialize.

Can't these niceties co-exist in e0 with <nav>-specified default
resource order?
Assume your e0 file has a "default resource order" specified in
index.html via a <nav> list, but you authored the HTML pages s.t. they
contain link@rel. We might decide that, if link@rel are present, they
supersede the <nav> order. This way, you are even free to create an
empty <nav> list and define the order by using link@rel.
(I am not entirely convinced by this mechanism, but I think it might be
a good compromise.)
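
A minimal sketch of that precedence rule, assuming the reading system already knows the `<nav>` order and whatever `rel=next` target the current page declares (both passed in as plain values). The function name `resolve_next` and its parameters are invented for the example.

```python
from typing import Optional, Sequence


def resolve_next(
    current: str,
    nav_order: Sequence[str],
    page_next: Optional[str],
) -> Optional[str]:
    """Pick the next resource: the page's own rel=next wins over the <nav> order."""
    if page_next is not None:
        return page_next
    order = list(nav_order)
    try:
        i = order.index(current)
    except ValueError:
        return None                      # current page is not listed in <nav>
    return order[i + 1] if i + 1 < len(order) else None


nav = ["ch1.html", "ch2.html", "ch3.html"]
print(resolve_next("ch1.html", nav, None))             # falls back to the nav order
print(resolve_next("ch1.html", nav, "ch1-quiz.html"))  # the page's link takes over
```

With an empty `<nav>` list the function relies entirely on the per-page links, which matches the "empty list" case described above.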

AlPe

Baldur Bjarnason

Mar 1, 2013, 9:48:05 AM

to epub-ng

On 1 Mar 2013, at 11:36, Florian Rivoal <flo...@rivoal.net> wrote:

> On Thu, 28 Feb 2013 23:09:13 +0100, Baldur Bjarnason <baldur.b...@gmail.com> wrote:
>
>
> Sorry for jumping in late, I just joined the list.
>
> I'd like to argue in favor of link@rel=next a little.

I'm pretty much in agreement with Hadrien on link@rel=next. I'm against it for much the same reasons.

It is neat, elegant, and more than a little bit clever, but it's also a little bit too neat, elegant, and clever.

One of the areas where it causes problems is in format conversions, e.g. from e0 to EPUB. Parsing a bunch of HTML5 files to re-serialise them into XHTML5 is easy for most tools, but having to assemble a reading order by reading the link tags of all of the files before writing out the converted EPUB is a pain. If you make the link@rel=next optional you make the process even more complicated.
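
To give a feel for the extra pass a converter would need, here is a sketch that rebuilds a linear order from per-file `rel=next` targets. It assumes the targets have already been extracted into a dict (file name to next file name, or `None`) and that every file belongs to a single chain; the name `assemble_order` is invented for the example.

```python
from typing import Dict, List, Optional


def assemble_order(next_links: Dict[str, Optional[str]]) -> List[str]:
    """Rebuild the reading order from each file's rel=next target."""
    targets = {t for t in next_links.values() if t is not None}
    # The first file is the only one that no other file points to.
    starts = [f for f in next_links if f not in targets]
    if len(starts) != 1:
        raise ValueError(f"expected exactly one starting file, found {len(starts)}")

    order: List[str] = []
    seen = set()
    current: Optional[str] = starts[0]
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current}")
        if current not in next_links:
            raise ValueError(f"{current} is linked to but missing from the package")
        seen.add(current)
        order.append(current)
        current = next_links[current]
    return order
```

Even in this simplified form the converter has to read every file before it can write anything, and it needs error paths for cycles, dangling links and multiple candidate starting points. Files that are deliberately outside the chain (a quiz, say) would show up as extra starting points, so optional links make the guesswork worse.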

- best
- baldur

Bill McCoy

Mar 1, 2013, 9:55:16 AM

to Florian Rivoal, epub-ng

> Browsers as they are today[1] can use this information to navigate to the next chapter once the end of the current one is reached....

That is a misleading statement.

Browsers with extensions added can handle today's EPUB too (e.g. Readium) so that's beside the point.

*Opera* can handle navigating through chained documents with link@rel=next today. And I believe that's only because it was part of Hakon's personal experiment with browser pagination and so on. That is a datapoint, but it has nothing to do with the core requirement Daniel laid out about native browser support (assuming more than 3% of browser usage coverage is desired).

--Bill

Florian Rivoal

Mar 1, 2013, 3:52:26 PM

to Bill McCoy, epub-ng

On Fri, 01 Mar 2013 15:55:16 +0100, Bill McCoy <whm...@gmail.com> wrote:

>> Browsers as they are today[1] can use this information to navigate to
>> the
>> next chapter once the end of the current one is reached....
>
>

> Browsers with extensions added can handle today's EPUB too (e.g. Readium)
> so that's beside the point.
>
> *Opera* can handle navigating through chained documents with
> link@rel=next today.
> And I believe that's only because it was part of Hakon's personal
> experiment with browser pagination and so on.
>
> That is a datapoint but
> nothing to do with the core requirement Daniel laid out about native
> browser support (assuming more than 3% of browser usage coverage is
> desired).

Actually, Opera has been able to do that long before Hakon's semi-recent
experiments with in-browser pagination, but I'll grant you the point about
Opera's tiny market share, especially since Opera-as-we-know-it is being
phased out.

However, my point does not completely go away: this is so much in line
with the intended semantics of link@rel=next that browsers may introduce
direct support for navigating through them, independently of what we do
with epub-ng.

Which brings an interesting point. If we use an ol in index.html rather
than link@rel=next to indicate the next chapter in the order of reading,
what happens if a document also sets link@rel=next (and sets it to
something different) in a browser that supports opera-style direct
navigation of link@rel=next? Also, what about crawlers and other things
that already understand link@rel=next?

- Florian

Florian Rivoal

Mar 1, 2013, 4:33:28 PM

to Hadrien Gardeur, epub-ng

On Fri, 01 Mar 2013 13:00:14 +0100, Hadrien Gardeur
<hadrien...@feedbooks.com> wrote:

> I see where you're coming from (REST and in particular HATEOAS) but it
> doesn't really apply here, especially since you admit yourself that a
> centralized ToC is necessary anyway.

The centralized ToC that I think we should retain in addition to
link@rel=next navigation is the one that doesn't necessarily follow
reading order. This one needs to be explicit to provide access to all
parts of the book, and needs to be independent of whatever it is we use to
set the reading order, since it doesn't have to follow it.

> The difference between a publication and a set of Web resources is
> essentially the "glue": how we link these documents together (reading
> order and ToC) and what they represent when we group them together
> (publication's metadata). To be truly stateless, not only would you have
> to rely on reading order discovery (link@rel="next"), you'd also have to
> repeat the ToC and the publication's metadata too.
>
> What you're advocating for is a mixed model where some information is
> stored between states (publication metadata, ToC) and other information
> is discovered whenever we fetch a resource (reading order).
> It doesn't make much sense.

That's not the way I see it. Of course, if the book is split into several
files, you won't have all the information about everything without going
through all the files, and if we store publication metadata in the
index.html, then you have to load that to read it. As long as we're
considering a format made of several files, this is inevitable, and I
don't see that as a mixed model.

However, this is not what I referred to when I said state. I was thinking
of the information that the ebook UA must have in memory to be able to
interact with the book and respond to user actions, and in particular,
navigation commands. With link@rel=next, you can respond to the "turn
page" action only based on information contained in the page. With an ol
in index.html, you can't.

> Forcing content creators to embed everything in HTML seems like an even
> worse restriction to me than what we currently have in EPUB3 (XHTML or
> SVG).
> For a 100-page comic, you'd need 100 HTML documents that
> contain little information at all,

I agree this is inconvenient, but I'd be willing to live with it though.
While I am fine with adding new mechanisms for things that are completely
missing from the vanilla web platform, I am less enthusiastic about adding
mechanisms for things that are there, but that we deem imperfect. First,
for simplicity's sake. Second, because having two mechanisms in place for
similar but subtly different things risks causing weird behavior when both
are used simultaneously. Also, because the web platform may gradually add
more functionality on top of its existing building blocks, and if we went
some other route, we'll either miss out, or have to duplicate it.

> just for the sake of supporting
> something that can't be stateless anyway.

See above for what I meant by stateless.

As I mentioned, I see a problem due to the statefulness of <ol> in
index.html. Here are 4 possible ways we could want to access that state:

- when opening the book, open index.html, extract the information about
the reading order and the ToC from the <ol> without running any javascript
first, and remember that forever.

- when opening the book, open index.html, run whatever javascript is
associated with the onload event, extract the information about the
reading order and the ToC from the <ol>, and remember that forever.

- whenever a navigation command (next, previous, etc) is issued, open
index.html, run whatever javascript is associated with the onload
event, extract the relevant info from <ol>, act on it, and forget it.

- when opening the book, load index.html in a persistent environment, with
an active DOM, and let any javascript embedded in the page run in it for
as long as it wants to. Whenever a navigation command (next, previous,
etc) is issued, open index.html, extract the relevant info from the
current version of <ol> in the DOM, act on it, and forget it.

Because of the ability of javascript to act on the <ol> through the DOM,
these 4 methods can all yield different results. In the vast majority of
books, this won't be an issue, but it can be, so we have to specify which
way we want it.
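
A toy simulation of why the choice matters. Plain Python lists stand in for the <ol> and plain functions stand in for scripts; no real DOM or JavaScript is involved, and the file names and both "scripts" are invented for the example.

```python
# Markup order of the <ol> in a hypothetical index.html.
MARKUP_ORDER = ["ch1.html", "ch2.html", "ch3.html"]


def onload_script(order):
    """Stand-in for an onload handler that inserts a quiz after chapter 1."""
    order.insert(1, "ch1-quiz.html")


def later_script(order):
    """Stand-in for a script that runs later, e.g. on a timer or user action."""
    order.append("bonus.html")


def load_index(run_onload):
    """Parse index.html afresh, optionally running its onload handler."""
    order = list(MARKUP_ORDER)
    if run_onload:
        onload_script(order)
    return order


# 1. Snapshot at open, before any script runs.
pre_js = load_index(run_onload=False)

# 2. Snapshot at open, after onload.
post_onload = load_index(run_onload=True)

# 3. Reload and run onload on every navigation command, then forget.
per_command = load_index(run_onload=True)

# 4. Persistent DOM: later scripts keep mutating the same list.
live = load_index(run_onload=True)
later_script(live)

for name, order in [("pre-js", pre_js), ("post-onload", post_onload),
                    ("per-command", per_command), ("live DOM", live)]:
    print(f"{name:12} {order}")
```

In this toy case methods 2 and 3 happen to coincide; they would diverge as soon as the onload handler depended on something that changes between invocations, such as the clock or stored state.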

It can be dealt with, but I'd rather not have to.

- Florian

Baldur Bjarnason

Mar 1, 2013, 4:28:39 PM3/1/13

to epub-ng

On 1 Mar 2013, at 20:52, Florian Rivoal <flo...@rivoal.net> wrote:
> Actually, Opera has been able to do that long before Hakon's semi-recent experiments with in-browser pagination, but I'll grant you the point about Opera's tiny market share, especially since Opera-as-we-know-it is being phased out.
>
> However, my point does not completely go away: this is so much in line with the intended semantics of link@rel=next that browsers may introduce direct support for navigating through them, independently of what we do with epub-ng.
>
> Which brings an interesting point. If we use an ol in index.html rather than link@rel=next to indicate the next chapter in the order of reading, what happens if a document also sets link@rel=next (and sets it to something different) in a browser that supports Opera-style direct navigation of link@rel=next? Also, what about crawlers and other things that already understand link@rel=next?
>
> - Florian

I think they were referring to the fact that Opera's link@rel=next support was one of their much much earlier navigation experiments, predating the pagination experiments by quite a few years.

link@rel=next dates back to the early days of HTML (for example, from HTML4 http://www.w3.org/TR/html4/types.html#type-links although it's even older than that, here the HTML3.2 reference http://www.w3.org/TR/REC-html32#link and you can find discussions about it online as a part of the HTML2 feature set even though it wasn't an official part of the spec).

It's been around for ages. If browsers were going to implement direct navigation support for it, they would have done so by now.

The bug for adding navigation support for next/prev links in bugzilla dates back to 1999 (https://bugzilla.mozilla.org/show_bug.cgi?id=2800). After fourteen years of work, I think it's pretty clear that it isn't happening. Also, as far as I can tell there is no bug for this in the Chromium open issues (https://code.google.com/p/chromium/issues/list).

UI and navigation support for the various link types used to be much more common, especially among alternative browsers. Mozilla (way way before Firefox née Phoenix) used to have an optional link bar, I think, and it was a big feature of iCab's and a few other minority browsers at the time. It's a feature on the retreat, not advancing, mainly used for SEO.

This is also a non-issue on tablets and phones. I don't see mobile browser vendors either adding additional UI for link types or adding keyboard navigation support for them.

And even if they do add support for it, I still don't think it matters. That just means that authors can use link@rel=next to override the reading order when the ebook is being read in a browser with no understanding or support for e0. You could even argue that it'd be a feature, not a bug ;-)

- best
- baldur

Florian Rivoal

Mar 1, 2013, 4:39:29 PM3/1/13

to epu...@googlegroups.com

On Fri, 01 Mar 2013 22:28:39 +0100, Baldur Bjarnason
<baldur.b...@gmail.com> wrote:

I see the reasoning behind most of what you said, but let me answer a
couple of points.

> UI and navigation support for the various link types used to be much
> more common, especially among alternative browsers. Mozilla (way way
> before Firefox née Phoenix) used to have an optional link bar, I think,
> and it was a big feature of iCab's and a few other minority browsers at
> the time. It's a feature on the retreat, not advancing, mainly used for
> SEO.

You're right. But the way I see it, this is because documents have been in
retreat in favor of applications. As we're pushing for ebooks, this is a
trend we're hoping to counter, at least partly.

> This is also a non-issue on tablets and phones. I don't see mobile
> browser vendors either adding additional UI for link types or adding
> keyboard navigation support for them.

No buttons for sure, but I could definitely see gestures.

Florian Rivoal

Mar 1, 2013, 4:56:38 PM3/1/13

to epu...@googlegroups.com

On Fri, 01 Mar 2013 13:36:47 +0100, Alberto Pettarin <pett...@gmail.com>
wrote:

> I guess that just the requirement of having a ZIP container will imply
> the need for browser extension. However I am not a browser expert, and I
> might be wrong on this, feel free to correct me.

If you try opening the zip in a browser, yes, browsers would need to be
extended to read it directly. But I'd like the format to have as little
magic as possible, so that if you unzip the zip in a regular folder (or
behind a static page web server) and point a browser at it, most things
should still work.

For things that can't be done with regular web technologies in a regular
browser (and are deemed necessary), we'll need to invent something, but
hopefully there shouldn't be many of these.

>> There are also (currently unimplemented) proposals[2] to be able to
>> style in css the transition (slide, flip page, fade...) from one page to
>> the other when following link@rel=next and friends. I don't expect this
>> particular proposal to be particularly successful as is, but if we go
>> some other route, we will miss such improvements to the web platform
>> when they materialize.
>
> Can't these niceties co-exist in e0 with <nav>-specified default
> resource order?
> Assume your e0 file has a "default resource order" specified in
> index.html via a <nav> list, but you authored the HTML pages s.t. they
> contain link@rel. We might decide that, if link@rel are present, they
> supersede the <nav> order. This way, you are even free to create an
> empty <nav> list and define the order by using link@rel.
> (I am not entirely convinced by this mechanism, but I think it might be
> a good compromise.)

If we define how our navigation works after these mechanisms have been
defined, sure, we can make it work with them. But not the other way
around, since they won't consider our thing when defining theirs, and may
spec it in an incompatible way. Any time we choose to invent a better
alternative to something that's already present in the web platform, we
not only give up on what the existing functionality is, but also on all
future improvements.
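
For reference, the precedence rule proposed in the quoted text (link@rel in the page supersedes the <nav> order) could be sketched like this. The function name resolve_next and the data layout are assumptions made for the example, not anything agreed on in this thread.

```python
def resolve_next(current, nav_order, page_links):
    """Pick the next document: rel=next in the page wins, else <nav> order.

    nav_order  -- list of hrefs in the order given by the <nav> list
    page_links -- dict mapping a document to its own {rel: href} links
    """
    override = page_links.get(current, {}).get("next")
    if override is not None:
        return override
    try:
        i = nav_order.index(current)
    except ValueError:
        return None  # document is not listed in the <nav>
    return nav_order[i + 1] if i + 1 < len(nav_order) else None


# Hypothetical book: only ch1.html carries its own rel=next.
nav_order = ["ch1.html", "ch2.html", "ch3.html"]
page_links = {"ch1.html": {"next": "ch1-quiz.html"}}

print(resolve_next("ch1.html", nav_order, page_links))  # ch1-quiz.html
print(resolve_next("ch2.html", nav_order, page_links))  # ch3.html
```

Note that this only works one way round: a reading system can layer such a rule on top of link@rel=next, but a browser that navigates link@rel=next natively knows nothing about the <nav> fallback.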

- Florian

Baldur Bjarnason

Mar 1, 2013, 5:27:02 PM3/1/13

to Florian Rivoal, epu...@googlegroups.com

On 1 Mar 2013, at 21:39, "Florian Rivoal" <flo...@rivoal.net> wrote:

> On Fri, 01 Mar 2013 22:28:39 +0100, Baldur Bjarnason <baldur.b...@gmail.com> wrote:
>
> I see the reasoning behind most of what you said, but let me answer a couple of points.
>
>> UI and navigation support for the various link types used to be much more common, especially among alternative browsers. Mozilla (way way before Firefox née Phoenix) used to have an optional link bar, I think, and it was a big feature of iCab's and a few other minority browsers at the time. It's a feature on the retreat, not advancing, mainly used for SEO.
>
> You're right. But the way I see it, this is because documents have been in retreat in favor of applications. As we're pushing for ebooks, this is a trend we're hoping to counter, at least partly.

I was countering the argument that we should use link@rel=next because that's the way browsers were heading. It isn't.

Link@rel=next has issues that I find problematic enough to turn me against it beyond the continuing lack of support by browsers:

* Having to parse all of the book's HTML files to build a full reading order.
* Making images, movies, audio, SVG and the like second class citizens that have to be embedded in HTML.
* The authorship problems with link@rel=next aren't nearly as much of a non-issue as you think. For a lot of users there is no such thing as trivial templates or build systems.

Any one of these issues is a deal-breaker in my opinion.

If you have concerns about javascript changing the structure of the <ol>, the simplest solution is to use a static data format, such as JSON, to dictate the reading order. Alternatively, we could simply require that the reading order be the order laid out in the markup before any JS is run, or that it be the order that remains after the initial onload. Pick one scenario and make it standard. Personally I'd just say that it should be the pre-JS markup order of elements that rules the day.
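
A rough sketch of what "pre-JS markup order" means in practice: parse index.html as text, never execute anything, and take the links inside <nav> in the order they appear. This is Python with only the standard library; the class and function names and the sample markup are assumptions for illustration.

```python
from html.parser import HTMLParser


class NavOrderParser(HTMLParser):
    """Collect href values of <a> elements inside <nav>, in markup order."""

    def __init__(self):
        super().__init__()
        self.nav_depth = 0
        self.order = []

    def handle_starttag(self, tag, attrs):
        if tag == "nav":
            self.nav_depth += 1
        elif tag == "a" and self.nav_depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.order.append(href)

    def handle_endtag(self, tag):
        if tag == "nav" and self.nav_depth > 0:
            self.nav_depth -= 1


def reading_order(index_html):
    """Return the reading order as written in the markup; scripts never run."""
    parser = NavOrderParser()
    parser.feed(index_html)
    return parser.order


# Hypothetical index.html fragment.
index_html = """<nav>
  <ol>
    <li><a href="file1.html">Chapter 1</a></li>
    <li><a href="file2.html">Chapter 2</a></li>
  </ol>
  <script>/* whatever this does, a plain parser never executes it */</script>
</nav>"""

print(reading_order(index_html))  # ['file1.html', 'file2.html']
```

Because nothing is executed, the result is the same no matter when or how often the file is read.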

>
>
>> This is also a non-issue on tablets and phones. I don't see mobile browser vendors either adding additional UI for link types or adding keyboard navigation support for them.
>
> No buttons for sure, but I could definitely see gestures.

I seriously doubt that. Gestures are already both way overloaded and nigh undiscoverable on most mobile platforms and the current trend is towards APIs allowing web authors to recognise gestures themselves with javascript, not towards built-in gestures for navigation. And even when they do add built-in gestures, those are usually part of the accessibility features and not enabled by default. And even if they do enable added navigation gestures there, link@rel=next specifically isn't likely to be on the radar of browser vendors, who are focusing on fleshing out support for ARIA and the new HTML5 semantic elements.

They've had two decades now and link@rel=next is a feature that's not likely to be making a comeback in browsers.

- best
- baldur

Hadrien Gardeur

Mar 1, 2013, 5:29:38 PM3/1/13

to Florian Rivoal, epub-ng

> However, this is not what I referred to when I said state. I was thinking of the information that the ebook UA must have in memory to be able to interact with the book and respond to user actions, and in particular, navigation commands. With link@rel=next, you can respond to the "turn page" action only based on information contained in the page. With an ol in index.html, you can't.

I understand perfectly what you mean by state, and it is still a mixed model.

If you really wanted to be stateless, then each resource would have to include a link@rel=next, but also a link@rel=prev (to respond to the previous page action) and a link@rel=package (or toc, or whatever we'd decide to call it) to access the index.html.

If you retain the information about the index.html then this is not a stateless model at all.

Anyway, I don't think that this is a good model for a packaged document format.

Alberto Pettarin

Mar 1, 2013, 5:40:25 PM3/1/13

to epu...@googlegroups.com

On 03/01/2013 11:27 PM, Baldur Bjarnason wrote:
> Link@rel=next has issues that I find problematic enough to turn me against it beyond the continuing lack of support by browsers:
>
> * Having to parse all of the book's HTML files to build a full reading order.
> * Making images, movies, audio, SVG and the like second class citizens that have to be embedded in HTML.
> * The authorship problems with link@rel=next aren't nearly as much of a non-issue as you think. For a lot of users there is no such thing as trivial templates or build systems.
>
> Any one of these issues is a deal-breaker in my opinion.

While there might be a workaround for the second and the third (although
disregarding them leads to a certain degree of inconvenience, both at
authoring and at run time), the first one is a deal-breaker for me as well.
I have authored EPUB ebooks with 2,000+ XHTML pages, and the idea that a
user agent has to parse the header of all of them just to present the
user with a ToC (which is a reasonable requirement) seems like a nightmare.
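
To put a number on it, here is a small simulation (Python, standard library only) of rebuilding the reading order by following rel=next from the first page. The 2,000-page "book" is generated in memory and all names are invented for the example; a real user agent would be reading and parsing files instead.

```python
from html.parser import HTMLParser


class NextLink(HTMLParser):
    """Remember the first <link rel="next"> href seen in a page."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        if "next" in (attr.get("rel") or "").lower().split():
            self.href = attr.get("href")


def walk_reading_order(first, read_file):
    """Follow rel=next from `first`; every step costs one read and one parse."""
    order, seen = [], set()
    current = first
    while current is not None and current not in seen:  # guard against loops
        seen.add(current)
        order.append(current)
        parser = NextLink()
        parser.feed(read_file(current))
        current = parser.href
    return order


# Build a hypothetical 2,000-page book in memory.
N = 2000
book = {}
for i in range(1, N + 1):
    link = f'<link rel="next" href="page{i + 1}.html">' if i < N else ""
    book[f"page{i}.html"] = f"<html><head>{link}</head><body></body></html>"

reads = []  # log of every file access


def read_file(name):
    reads.append(name)
    return book[name]


order = walk_reading_order("page1.html", read_file)
print(len(order), "documents in reading order,", len(reads), "files parsed")
```

Every document has to be opened once before the full order is known, whereas a single list in index.html gives the same information from one file.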

AlPe

Hadrien Gardeur

Mar 1, 2013, 5:43:29 PM3/1/13

to Florian Rivoal, epub-ng

> If you really wanted to be stateless, then each resource would have to include a link@rel=next, but also a link@rel=prev (to respond to the previous page action) and a link@rel=package (or toc, or whatever we'd decide to call it) to access the index.html.

You can also imagine some of the side effects here.

User is reading document 1 which links to document 2 as the next document. User decides to get to the next document and the RS opens document 2. Unfortunately for the user, document 2 links to document 3 as its previous document. If the user decides to go back, instead of getting document 1, it's document 3 that gets displayed.

Do we want to deal with this kind of crap ? HELL NO !
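
The failure mode above is easy to detect but only after every document's links have been collected. A minimal sketch in Python, with made-up file names and a made-up function name:

```python
def find_inconsistencies(links):
    """Return (doc, next_doc, prev_of_next) triples where next/prev disagree.

    links -- dict mapping each document to its own {rel: href} links
    """
    problems = []
    for doc, rels in links.items():
        nxt = rels.get("next")
        if nxt is None:
            continue
        back = links.get(nxt, {}).get("prev")
        if back != doc:
            problems.append((doc, nxt, back))
    return problems


# Document 1 points forward to document 2, but document 2 claims that
# its previous document is document 3.
links = {
    "doc1.html": {"next": "doc2.html"},
    "doc2.html": {"prev": "doc3.html", "next": "doc3.html"},
    "doc3.html": {"prev": "doc2.html"},
}

print(find_inconsistencies(links))
# [('doc1.html', 'doc2.html', 'doc3.html')]
```

With a single ordered list in index.html this class of error cannot be expressed at all, since "previous" is simply the entry before the current one.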

Hadrien Gardeur

Mar 1, 2013, 6:01:44 PM3/1/13

to Florian Rivoal, epub-ng

One final note: if you want to be stateless, yet use the index.html to store the reading order, ToC and metadata, it's fairly straightforward. Just include a link to the index.html in all your documents with the proper rel (and thanks to HTTP headers, this is not limited to HTML documents).

Think of your index.html as a service discovery document, that gives you all the interactions that you can have (what's next, what's before, how can I skip directly to another document using the ToC).
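
As an illustration of the HTTP header route, here is a simplified sketch of pulling the index.html reference out of a Link header sent with a non-HTML resource. The rel value "index" is only a placeholder (the rel name has not been decided in this thread), and the parser is deliberately naive: it does not handle commas or semicolons inside quoted parameter values.

```python
import re

# One <target> followed by its ;param=value pairs (simplified grammar).
LINK_RE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')
REL_RE = re.compile(r'rel\s*=\s*"?([^";]+)"?', re.IGNORECASE)


def parse_link_header(value):
    """Map each rel token in a Link header value to its target URI."""
    rels = {}
    for target, params in LINK_RE.findall(value):
        match = REL_RE.search(params)
        if not match:
            continue
        for rel in match.group(1).lower().split():
            rels.setdefault(rel, target)
    return rels


# Hypothetical header value served alongside an image or SVG page.
header = '<index.html>; rel="index"; type="text/html"'
print(parse_link_header(header))  # {'index': 'index.html'}
```

A reading system that lands on any resource can then fetch index.html and ask it what comes next, what came before, and where the ToC is.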

This could be the one and only requirement for a container-less version of EPUB NG (this is probably quite similar to what Bill had in mind when he wanted to use RSS/Atom for the spine).

Daniel Glazman

Mar 2, 2013, 7:57:38 AM3/2/13

to epu...@googlegroups.com

On 01/03/13 13:36, Alberto Pettarin wrote:

> I guess that just the requirement of having a ZIP container will imply
> the need for browser extension. However I am not a browser expert, and I
> might be wrong on this, feel free to correct me.

https://wiki.mozilla.org/WebAPI/ArchiveAPI

</Daniel>

Bill McCoy

Mar 2, 2013, 10:06:41 AM3/2/13

to Hadrien Gardeur, Florian Rivoal, epub-ng

Yes what I'd been musing about lately for unpackaged "networked publications" was the OPF file as the "service discovery document". What I sense this group is converging to wrt "index.html" is the merging of OPF into Navigation Document which I think would be fantastic if it could be pulled off, and arguably consistent w/ the step already taken in EPUB 3.0 of moving TOC from special XML schema (NCX) to HTML microdata. Nav Doc as root is certainly more HATEOAS than OPF. OTOH, JSON would seem au courant for this kind of thing and I'm sure using HTML would not be universally loved.

One fly in the ointment is if encryption needs to be supported, as in that case from a practical perspective a current EPUB publication has two roots, you need both the OPF (pointed to by container.xml) and encryption.xml. I'm presuming this group won't care about DRM (and in an unpackaged scenario we may have https://dvcs.w3.org/hg/html-media/raw-file/tip/encrypted-media/encrypted-media.html ) but font obfuscation is potentially a consideration.

Another fly in the ointment is if multiple renditions need to be supported. This is important for manga, digital magazines and other fixed-layout scenarios not just for combining several fixed layouts but also allowing a reflowable rendition to be there as well, for accessibility. In that case the dual roots of an EPUB today are container.xml + encryption.xml (since the container.xml can point to multiple opf's).

Lastly the seeming failure of the HTML5 "cache manifest" is cautionary: W3C seems to be backing away from it entirely and Google and Netscape and MSFT seem to be going with packaged web apps. Radicals like me were advocating for RSS back when we did EPUB 2 but in EPUB 3 the cache manifest was the bright shiny object. I haven't heard from W3C folks what the specific issues are but Daniel may know. I know when Google launched their new Packaged Apps at IO last year they motivated it by dinging online web apps inc. cache manifest by talking about limitations of their own Google Reader - http://www.youtube.com/watch?v=j8oFAr1YR-0 starting at about 3:00.

Stepping back, if you guys sort out networked (unpackaged, distributed, potentially dynamic) publications I think that would be a fantastic result. Today's EPUB is not used in any standard way in an unpackaged/networked scenario so it's a "greenfield" opportunity.

--Bill

Baldur Bjarnason

Mar 2, 2013, 12:58:31 PM3/2/13

to epub-ng

On 2 Mar 2013, at 15:06, Bill McCoy <whm...@gmail.com> wrote:

> Yes what I'd been musing about lately for unpackaged "networked publications" was the OPF file as the "service discovery document". What I sense this group is converging to wrt "index.html" is the merging of OPF into Navigation Document which I think would be fantastic if it could be pulled off, and arguably consistent w/ the step already taken in EPUB 3.0 of moving TOC from special XML schema (NCX) to HTML microdata. Nav Doc as root is certainly more HATEOAS than OPF. OTOH, JSON would seem au courant for this kind of thing and I'm sure using HTML would not be universally loved.

The reason why HTML is more appropriate in this case is that the index.html file serves both the reader and the machine and has to be readable by both. It also ties in with a long-standing tradition of index and ToC html files in websites. JSON would not serve the requirements.

>
> One fly in the ointment is if encryption needs to be supported, as in that case from a practical perspective a current EPUB publication has two roots, you need both the OPF (pointed to by container.xml) and encryption.xml. I'm presuming this group won't care about DRM (and in an unpackaged scenario we may have https://dvcs.w3.org/hg/html-media/raw-file/tip/encrypted-media/encrypted-media.html ) but font obfuscation is potentially a consideration.

Actually, Hadrien has already covered the DRM issue: DRMed files should be a different filetype. As far as I understand, one big issue he (and others) have been facing is the fact that encrypted and unencrypted EPUBs are indistinguishable from the outside. Everybody's lives would be made easier if encrypted EPUB Zeros were a different format with a different mimetype and a different filename suffix (e.g. .ee0). In that case the encrypted version could use different packaging from the unencrypted one, a packaging that can be designed at a later date without holding .e0 up. That would let encrypted EPUB Zeros have two roots where one of the roots is identical to that of an unencrypted e0 and the other one holds the encryption data.

Regarding font obfuscation (a problem I've been obsessing about for a while now and talking about with a few people I know):

It hardly ever works.

Liz Castro, one of the industry's foremost experts on EPUB, constantly runs into problems getting it to work. Others frequently see it break in DRMed files. It seems to be fiddly in both iBooks and in Adobe's systems, sometimes working, sometimes not, often breaking for a variety of reasons, at the slightest change. Tools to deal with it are few and far between and those that are available only come in big problematic packages (InDesign, Sigil). Font obfuscation as a ritual for blessing fonts for embedding is like sacrificing a chicken to the gods so that they can make that chicken lay more eggs. You only end up with chicken guts, no eggs, and a bunch of bemused deities.

One approach that comes to mind ties into the fact that many font foundries aren't specific in their licensing about the type or kind of obfuscation they require on their fonts. In any case, rather than opine about this in the dark, I sent an email to Fontspring (the only major seller of ebook font licenses other than Adobe) asking them whether their font foundries would consider applying the same embedding requirements to ebook fonts as they do to web fonts, offering a couple of suggestions along the way. Given that nobody requires fonts openly served on the web to be obfuscated in the same way ebook fonts are, there's a good chance that more than a few foundries can be negotiated into changing their requirements to match the ones they make of web embedding.

At the very least, talking to font licensors is something I'd like to try first before anybody decides to make e0 more complicated to support a feature that breaks every time you touch it.

>
> Another fly in the ointment is if multiple renditions need to be supported. This is important for manga, digital magazines and other fixed-layout scenarios not just for combining several fixed layouts but also allowing a reflowable rendition to be there as well, for accessibility. In that case the dual roots of an EPUB today are container.xml + encryption.xml (since the container.xml can point to multiple opf's).

I've come to the opinion that fixed layout HTML chapters in ebooks are an abomination. FXL transforms HTML into a second rate PDF replacement.

Multiple renditions are also very problematic in many cases. Systems that offer similar capabilities, such as publication apps generated by the Adobe toolchain, are often a sign of trying to do the wrong thing (creating inflexible orientation- and hardware-specific designs). I think that going beyond the adaptive capabilities offered by HTML+CSS is a bad idea with pretty negative consequences down the line for the publisher in question (e.g. maintainability, hardware compatibility, workflow complexity).

And even if people disagree with me and still think FXL HTML chapters and multiple renditions are a good idea, or are one of the edge cases that really truly honestly do require multiple renditions (I'm sure there are plenty), then that still doesn't mean e0 should support it.

My impression is that it is perfectly fine if EPUB Zero doesn't solve everybody's problem. In fact, it would be a horrible idea if we tried. Those who need multiple renditions have EPUB3 and the IDPF's work to serve their purposes. It'd be crazy to try and replace EPUB3 for every single use case it serves. You'd only end up with an incompatible EPUB3 clone.

Things I personally think the base EPUB Zero format should *never* even try to address:

* Print compatibility (such as mapping page numbers to HTML positions)
* Fixed layout HTML chapters
* Multiple renditions
* SMIL/media overlays support beyond what browsers plan to support
* SSML or synthetic speech support beyond what browsers plan to support
* Content switching that isn't browser native

If people want to create a community-designed extension to e0 for any one of these things, that's fine, but we shouldn't try to accommodate or incorporate these features in any way and they shouldn't affect our design decisions. Those hypothetical future extensions should just have to work with whatever we end up deciding.

If we want fixed layout, use the appropriate format and include an image or SVG file instead of a FXL HTML abomination. EPUB Zero supports this quite nicely already making it easy to mix fixed layout pages (images and SVG) and flowing pages (HTML).

I'd go so far as to dictate that the default rendering of e0 files matched that of the browser (scrolling) and that, in e0-supporting reading systems, the paginated view had to be selected specially on a book by book basis, so that the reader knows that doing so might compromise the chapter's rendering. But that's a discussion for a later day.

>
> Lastly the seeming failure of the HTML5 "cache manifest" is cautionary: W3C seems to be backing away from it entirely and Google and Netscape and MSFT seem to be going with packaged web apps. Radicals like me were advocating for RSS back when we did EPUB 2 but in EPUB 3 the cache manifest was the bright shiny object. I haven't heard from W3C folks what the specific issues are but Daniel may know. I know when Google launched their new Packaged Apps at IO last year they motivated it by dinging online web apps inc. cache manifest by talking about limitations of their own Google Reader - http://www.youtube.com/watch?v=j8oFAr1YR-0 starting at about 3:00.

This post is one of the more detailed outlines of appcache's problems http://alistapart.com/article/application-cache-is-a-douchebag but it doesn't get into some of the API requirements that outfits like Facebook want from offline web apps.

Despite the appcache's problems (which are numerous) nobody is backing away from it. AFAICT, people are just realising that it only addresses a fraction of the offline problem and are coming up with alternate solutions for the space it doesn't address.

The spec may be flawed but it's very widely implemented (http://caniuse.com/#search=app%20cache). In the case of ebooks, where you are very likely to know what files you are going to be using and the publication is unlikely to be either a fully-fledged app or a website with hundreds of thousands of users (à la Facebook), it actually serves very well.

So, it isn't something I think we should worry about. For the kind of publications we are talking about, a lot of the time we will be able to just use the appcache to provide offline access when hosting ebooks as websites. And in most ebook cases, just offering the .e0 file (or even epub) for download alongside the hosted version is more than enough to solve everybody's problems.

- best
- baldur

Bill McCoy

Mar 2, 2013, 2:57:51 PM3/2/13

to Baldur Bjarnason, epub-ng

"... HTML is more appropriate in this case is that the index.html file serves both the reader and the machine and has to be readable by both. It also ties in with a long-standing tradition of index and ToC html files in websites. JSON would not serve the requirements."

Baldur, I agree with you and of course this is the direction we already went w/ TOC in EPUB 3.0. But the requirement that the TOC/spine be easily parsed by machine (which was also implicitly part of discussion on rel="next") is in tension with the requirement of being immediately human-presentable information (and if "tag soup" HTML is OK, which addresses the different requirement of being readily hand-authorable, that goes double). No right answer necessarily as standards are always compromises, I was just pointing out that some people will criticise representing data as HTML vs. something like JSON or the plain text that cache manifest and HTTP headers use, particularly if the end result presumes a bit of JS or server-side scripting will in most cases be involved in presenting EPUB NG content.

Re: font obfuscation. EPUB 2 didn't mandate it or font embedding at all, and so the fact that support has been uneven is both unsurprising and irrelevant. It's required in EPUB 3, and increasingly widely supported even in EPUB 2 reading systems. Adobe at least did both business & legal investigation and concluded that font foundries were, with it, comfortable extending the same rights to font embedding for EPUB that they give to PDF (PDF in effect has a poor man's font obfuscation by virtue of its tangled packaging). If an EPUB NG had to convince font vendors that unobfuscated fonts (that could be simply dragged-and-dropped into system font folders on modern OS's with zero add-on SW) were fine that would seem like an extra burden to take on. Or it could simply not define that obfuscated fonts must be supported, but not prevent them either, and let the chips fall where they may. If you're going to propose that every EPUB NG manipulating program must build in 500KB of tag soup HTML parser and not be able to do stream-oriented processing (given tag soup fix-up isn't friendly to expat-style parsing), a few lines of code to fix up fonts doesn't seem too big an additional burden, and it would only be needed if & when the embedded fonts are needed.

Re: cache manifest being a failure I'll leave it to others to chime in if they have any specifics. I have heard it being referred to lately as a "fiasco". You're right that it got widespread browser support but that doesn't mean it's a good solution. But I really don't know any details. And it anyway wasn't a directly relevant point to this discussion.

Re: pagination not being supported, I get back to the question of what content/experiences you are trying to support. If just to stuff any arbitrary website in a ZIP file, that's been done and doesn't need any new specs. If to spec how to handle security and other issues about execution of such arbitrary website in a downloaded context, that's being done right now ( http://www.w3.org/2012/09/sysapps-wg-charter.html ). If to support only Inkling-style experiences, then fine you don't need pagination (maybe, and you'd better slice your content very finely). If really to be an alternative to EPUB for all kinds of publications then certainly pagination is an expected mode of consumption and I would encourage you guys to think through the implications. Very little immersive reading is done via the "next-scroll-scroll-scroll-next" UX of e.g. Safari Books Online (and I think even they have moved on to paged views).

--Bill

Hadrien Gardeur

Mar 2, 2013, 3:41:15 PM3/2/13

to Bill McCoy, Florian Rivoal, epub-ng

> Yes what I'd been musing about lately for unpackaged "networked publications" was the OPF file as the "service discovery document". What I sense this group is converging to wrt "index.html" is the merging of OPF into Navigation Document which I think would be fantastic if it could be pulled off, and arguably consistent w/ the step already taken in EPUB 3.0 of moving TOC from special XML schema (NCX) to HTML microdata. Nav Doc as root is certainly more HATEOAS than OPF. OTOH, JSON would seem au courant for this kind of thing and I'm sure using HTML would not be universally loved.

JSON is just a serialization format; if we adopted a JSON equivalent to index.html we'd have to reinvent the wheel for pretty much everything that we use:

* how we express metadata
* how we express links
* how we display a list
* etc.

The only good thing about JSON is that it would be easier to parse and the document would be less complex, but since we expect such publications to be read by a browser or a RS anyway, they already support HTML just fine.

> One fly in the ointment is if encryption needs to be supported, as in that case from a practical perspective a current EPUB publication has two roots, you need both the OPF (pointed to by container.xml) and encryption.xml. I'm presuming this group won't care about DRM (and in an unpackaged scenario we may have https://dvcs.w3.org/hg/html-media/raw-file/tip/encrypted-media/encrypted-media.html ) but font obfuscation is potentially a consideration.

We don't need two roots, index.html could easily reference any other document using a link and proper rel/media type combo.

> Another fly in the ointment is if multiple renditions need to be supported. This is important for manga, digital magazines and other fixed-layout scenarios not just for combining several fixed layouts but also allowing a reflowable rendition to be there as well, for accessibility. In that case the dual roots of an EPUB today are container.xml + encryption.xml (since the container.xml can point to multiple opf's).

I'm not really happy about what's going on with some of the current AHL work.

For example, selecting multiple OPFs, where we'll have to duplicate a lot of information, doesn't sound like a very good move to me. We're trying to avoid creating new XML vocabularies, yet we're thinking of adding some right now in the EPUB3 WG.

Multiple renditions are fully possible within the scope of a single index.html (or OPF in EPUB3) file, and probably a better idea too (this avoids duplication). That's not something that this group is going to dive into, but there's no reason why it couldn't be possible.

Lastly the seeming failure of the HTML5 "cache manifest" is cautionary: W3C seems to be backing away from it entirely and Google and Netscape and MSFT seem to be going with packaged web apps. Radicals like me were advocating for RSS back when we did EPUB 2 but in EPUB 3 the cache manifest was the bright shiny object. I haven't heard from W3C folks what the specific issues are but Daniel may know. I know when Google launched their new Packaged Apps at IO last year they motivated it by dinging online web apps inc. cache manifest by talking about limitations of their own Google Reader - http://www.youtube.com/watch?v=j8oFAr1YR-0 starting at about 3:00.

I've worked with Atom quite a lot (think of it as the clean version of RSS) for OPDS, and it's not a good fit for a spine or an alternative to what we're defining with index.html.

Too many requirements on <entry> and a single collection model (<feed>) per document, when we might need several lists in index.html.

Stepping back, if you guys sort out networked (unpackaged, distributed, potentially dynamic) publications I think that would be a fantastic result. Today's EPUB is not used in any standard way in an unpackaged/networked scenario so it's a "greenfield" opportunity.

I don't think that we're directly discussing networked publications, but what we're defining could either work pretty well for a fully unpackaged, distributed publication, or for a partially distributed publication where some documents in the reading order are outside of the package.

Hadrien

Daniel Glazman

Mar 3, 2013, 10:10:59 AM

to epu...@googlegroups.com

On 02/03/13 18:58, Baldur Bjarnason wrote:

> JSON would not serve the requirements.

Agreed 100%. All data in an ebook package should potentially be
rendered one way or another. Using JSON will be a burden on viewing
and editing tools, which will have to reserialize JSON into markup
languages for visual or audio rendering. If we're going for simplicity,
that's an undesirable extra step.

</Daniel>

Alberto Pettarin

Mar 3, 2013, 1:47:36 PM

to epu...@googlegroups.com

I vote against supporting font obfuscation.

From my experience with several Italian publishers, even those who were very jealous about the "look-and-feel" of their ebooks (especially w.r.t. the choice of fonts) are not going to embed them any longer, both for commercial reasons (= huge licensing costs) and for technical reasons (= readers complaining about the impossibility of overriding the publisher font on their not-so-shiny devices).

Hadrien Gardeur

Mar 5, 2013, 9:36:58 AM

to Alberto Pettarin, epub-ng

We're discussing too many issues in the same thread; let's divide things properly from now on.
