Concerns about the discussions happening here

Daniel Glazman

Mar 10, 2013, 9:08:23 AM
to epub-ng
Guys, I have deep concerns about the informal discussions happening
here. Granted, "informal" means anyone can send ideas, advocate for
them w/o having to care about IPRs or spec writing yet.

But when I see mentions of RDFa, DocBook, metadata vocabularies and the
html validator, I think this informal forum goes nowhere, or to be more
precise goes in a direction I am not comfortable with. I am not here
for something that will be bigger than EPUB3 and as complex as EPUB3.

I am not an epub reader implementor, I am an editing environment
implementor. Based on that experience, I can affirm EPUB3 is a hell
of a spec to implement and deal with. So I dream of a
dead-simple format, so close to the Web
we use every day in our browsers that a web editor could trivially
become an ebook editor, that an unzipped ebook could be trivially opened
TOC INCLUDED in a web browser. I don't want namespaces everywhere, I
don't want extra CSS that main desktop browsers don't support, I don't
want complex standards that only 50 individuals on the planet can be
called an expert of. I want to be able to collect 15 articles from the
Web and trivially form an ebook from that. I don't want to need third-
party software for that if you exclude ZIP. I can accept creating a
trivial TOC in html linking to my articles. Not more.

At this point, I think Dave and I should just start editing an E0
Editor's Draft and submit it to a joint group between W3C and IDPF
for discussion and potentially standardization. I don't think we will
break the endless loops I see starting in this forum without a written
proposal on our radar.

</Daniel>

Hadrien Gardeur

Mar 10, 2013, 9:30:36 AM
to Daniel Glazman, epub-ng


> I can accept creating a
> trivial TOC in html linking to my articles. Not more.

At this point we've agreed on two things: the zip is the manifest, and the ToC/reading order should be in a nav element of index.html.

I don't think that we've strayed away from the initial goal and I'm sorry but I don't see how these two conclusions have anything to do with a format "more complex than EPUB3".

> At this point, I think Dave and I should just start editing an E0
> Editor's Draft and submit it to a joint group between W3C and IDPF
> for discussion and potentially standardization. I don't think we will
> break the endless loops I see starting in this forum without a written
> proposal on our radar.

A written proposal would certainly help, but there are still a few things that haven't been discussed (especially metadata).

There's a lot of value in having such discussions in the open, and although we may have long arguments about slightly more trivial things, we end up having a decision in the end (the discussion about ToC was a good example, with pros/cons about link@rel=next vs nav in HTML, but also discussions about reading order).

I don't think that a private discussion between two people and then two organizations would be anywhere near as useful.

The level of engagement that I see here is higher than in most, if not all, of the current IDPF WGs, which is a good sign that people are interested and have things to say.
Where you see endless loops, I see engagement among people who usually don't talk to each other directly, aside from the occasional comment on a blog post.

Hadrien

Alberto Pettarin

Mar 10, 2013, 9:34:10 AM
to epu...@googlegroups.com
On 03/10/2013 02:08 PM, Daniel Glazman wrote:
> I am not an epub reader implementor, I am an editing environment
> implementor. Based on that experience, I can affirm EPUB3 is a hell
> of a spec to implement and deal with. So I dream of a
> dead-simple format, so close to the Web
> we use every day in our browsers that a web editor could trivially
> become an ebook editor, that an unzipped ebook could be trivially opened
> TOC INCLUDED in a web browser. I don't want namespaces everywhere, I
> don't want extra CSS that main desktop browsers don't support, I don't
> want complex standards that only 50 individuals on the planet can be
> called an expert of. I want to be able to collect 15 articles from the
> Web and trivially form an ebook from that. I don't want to need third-
> party software for that if you exclude ZIP. I can accept creating a
> trivial TOC in html linking to my articles. Not more.

I am with Daniel w.r.t. the above. What Daniel proposes is sufficient
for >80% of the eBooks on the market, and it will make reading software
(RS) very easy to write, hopefully freeing development resources towards
implementing "advanced features" which most RSes now lack.

I am currently supervising a project by three students at my (former?)
research group in Padua, Italy, who are implementing an EPUB 3 reader
for Android, geared towards the research community (= notes,
multiple/linked viewports, resolution of external entities, etc.), to be
released under GPL before the summer.
I confirm that it is a bloody hell, to the point we are intentionally
not supporting parts of the EPUB 3 specs, and, on the other hand, we are
focusing on implementing "smart strategies" to deal with common
situations (like: footnotes/citation recognition, etc.), even without a
shared vocabulary to define them.

From my (admittedly, limited) experience, the e0 proposal consisting of:

1) ZIP container
2) index.html with two lists: a) "resources default order" (required)
and b) TOC (optional)

seems a good balance between use cases coverage and ease of
authoring/rendering. I even thought about making "index.html" optional:
if it is not present, the RS will use the lexicographic order of the
filenames in the ZIP archive as the "resources default order" and "TOC".
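The fallback rule described above can be sketched in a few lines of Python. This is only an illustration, not part of any e0 draft: the names `reading_order`, `NavLinkCollector` and `build_archive`, the restriction of the fallback to `.html`/`.htm` entries, and the choice to read every link inside the first `nav` of `index.html` are all assumptions made for the example.

```python
import io
import zipfile
from html.parser import HTMLParser


class NavLinkCollector(HTMLParser):
    """Collect href values of links inside the first <nav> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # > 0 while inside the first <nav>
        self.done = False   # True once the first <nav> has been closed
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "nav" and not self.done:
            self.depth += 1
        elif tag == "a" and self.depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "nav" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True


def reading_order(archive: zipfile.ZipFile) -> list[str]:
    """Return the default resource order for a (hypothetical) e0 archive."""
    names = [n for n in archive.namelist() if not n.endswith("/")]
    if "index.html" in names:
        collector = NavLinkCollector()
        collector.feed(archive.read("index.html").decode("utf-8"))
        collector.close()
        return collector.hrefs
    # No index.html: fall back to lexicographic order of the HTML entries
    # (filtering on the extension is an assumption of this sketch).
    return sorted(n for n in names if n.endswith((".html", ".htm")))


def build_archive(files: dict[str, str]) -> zipfile.ZipFile:
    """Helper for experiments: build an in-memory ZIP from name -> text."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return zipfile.ZipFile(buffer)
```

With an archive holding only `ch2.html`, `ch1.html` and `style.css`, `reading_order` would yield the two chapters sorted by name; as soon as an `index.html` with a `nav` is added, the order of its links wins.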

Apologies for the long email, have a good Sunday,

AlPe

Baldur Bjarnason

Mar 10, 2013, 10:22:27 AM
to Daniel Glazman, epub-ng
A lot of the above is my fault for mentioning RDFa lite as a possibility in the first place and letting Bill drag me into a silly argument on XHTML versus HTML. I apologise for that.

I agree with you: I personally don't see the need for either an epub:type alternative or the option of outlining spines and tocs separately. My reasoning was that, since people are pressing for these features, I'd try to promote the simplest solution (epub-type). I should have pushed harder against the drive to complexity.

But, as I was walking back from the market with my curry in hand (a guy there sells an awesome Jamaican goat curry) a compromise solution crossed my mind:

Meaningful classes, microformat-style.

I.e. that we define two optional classes (spine, toc) that people can use to designate separate spines and tocs and skip the metadata question completely, and leave the epub:type style vocabulary thing to EPUB3.

These classes would be completely optional, and the default, in the absence of a spine class, would be to just use the first OL in the first NAV tag.
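A rough Python sketch of how a reading system might apply that rule follows. It assumes the `spine` class sits on an `ol` element, which the proposal does not actually specify, and the names `SpineFinder` and `find_spine` are invented for the example.

```python
from html.parser import HTMLParser


class SpineFinder(HTMLParser):
    """Collect links from <ol class="spine"> and, separately, from the
    first <ol> inside the first <nav> (the proposed default)."""

    def __init__(self):
        super().__init__()
        self.nav_depth = 0        # > 0 while inside any <nav>
        self.navs_opened = 0      # number of top-level <nav> elements seen
        self.spine_depth = 0      # > 0 while inside the class="spine" list
        self.default_depth = 0    # > 0 while inside the default list
        self.spine_found = False
        self.default_found = False
        self.spine = []
        self.default = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "nav":
            if self.nav_depth == 0:
                self.navs_opened += 1
            self.nav_depth += 1
        elif tag == "ol":
            # Track nesting so sub-lists stay part of the list they sit in.
            if self.spine_depth > 0:
                self.spine_depth += 1
            elif "spine" in classes and not self.spine_found:
                self.spine_found = True
                self.spine_depth = 1
            if self.default_depth > 0:
                self.default_depth += 1
            elif (self.nav_depth > 0 and self.navs_opened == 1
                  and not self.default_found):
                self.default_found = True
                self.default_depth = 1
        elif tag == "a" and attributes.get("href"):
            if self.spine_depth > 0:
                self.spine.append(attributes["href"])
            if self.default_depth > 0:
                self.default.append(attributes["href"])

    def handle_endtag(self, tag):
        if tag == "nav" and self.nav_depth > 0:
            self.nav_depth -= 1
        elif tag == "ol":
            if self.spine_depth > 0:
                self.spine_depth -= 1
            if self.default_depth > 0:
                self.default_depth -= 1


def find_spine(html: str) -> list[str]:
    """Use the class="spine" list if present, else the default list."""
    finder = SpineFinder()
    finder.feed(html)
    finder.close()
    return finder.spine if finder.spine_found else finder.default
```

The point of the sketch is that the optional class only adds one extra branch: a document with no classes at all still resolves to a reading order.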

I think the entire metadata thing is overkill and is an area that we shouldn't be venturing into at all.

In any case, what you and Dave do is your call. And I apologise again for dragging things into the quagmire.

- best
- baldur

Bill McCoy

Mar 10, 2013, 11:37:52 AM
to Daniel Glazman, epub-ng
Daniel,

I apologize as well for asking so many questions here on this list, especially as I represent in part the voice of "what is", not the voice of "what could be".

But, there are already a million ways to zip up a website. I didn't see anything in what Dave C originally suggested that wasn't already covered by relatively unsuccessful (in terms of broad adoption) and sorely under-specified formats like HPub and ZHook.  What I saw as most interesting was thinking out of the box about how it might be possible to improve EPUB in a hypothetical future version where we sacrificed backwards compatibility. 

If you and some others want to instead make a spec that takes another whack at the HPub/Zhook pinata, without having to answer annoying questions about use case scenarios or feature coverage vs. today's EPUB, or things like accessibility, go for it.

If that's the direction the group or a subset of it decides to go - not to have a thought experiment about how we might improve EPUB if we sacrificed backwards compatibility - but something else entirely, something as you say smaller than EPUB, I would in that case like to request that you start using some other name that does not include "EPUB".

And if some IDPF member(s) want to at some point propose a WG about it via IDPF, of course proposals are always welcome.

As far as "complex standards that only 50 individuals on the planet can be called an expert of", as you know the Web itself contains over 100 different specifications, many of them more complicated than anything in EPUB itself. Even the panoply of CSS with all of its modules arguably qualifies for your statement more so than anything in EPUB (at TOC I heard a story from Murray about how he stumped all the experts with a question about running headers in CSS, which finally only Hakon could answer by pointing to a particular obscure corner of the CSS Generated Content spec). So I think in part you are reaching for something unrealistic in expecting that simplification of how Web content gets packaged into an eBook by being less clear about the specification thereof would make things any easier overall.

--Bill

Daniel Glazman

Mar 10, 2013, 1:53:04 PM
to Bill McCoy, epub-ng
On 10/03/13 16:37, Bill McCoy wrote:
> Daniel,
>
> I apologize as well for asking so many questions here on this list,
> especially as I represent in part the voice of "what is", not the voice
> of "what could be".

Bill, you don't have to apologize, my message was NOT aimed at anyone,
in general or in particular.
I only expressed a feeling I got reading the threads.

> If you and some others want to instead make a spec that takes another
> whack at the HPub/Zhook pinata, without having to answer annoying
> questions about use case scenarios or feature coverage vs. today's EPUB,
> or things like accessibility, go for it.

Trust me, I don't want to spend time on something useless :-)
My spare cycles are unfortunately rare enough...

> If that's the direction the group or a subset of it decides to go - not to
> have a thought experiment about how we might improve EPUB if we
> sacrificed backwards compatibility - but something else entirely,
> something as you say smaller than EPUB, I would in that case like to
> request that you start using some other name that does not include "EPUB".
>
> And if some IDPF member(s) want to at some point propose a WG about it
> via IDPF, of course proposals are always welcome.
>
> As far as "complex standards that only 50 individuals on the planet can
> be called an expert of", as you know the Web itself contains over 100
> different specifications, many of them more complicated than anything in
> EPUB itself. Even the panoply of CSS with all of its modules arguably

Oh, I did not have EPUB in mind and I apologize if you thought
my words were targeting IDPF specs. You are absolutely right that CSS is
hard to understand at the deeper level. I was in fact thinking of
some other Web Standards... ;-)

> So I think in part you are reaching for something
> unrealistic in expecting that simplification of how Web content gets
> packaged into an eBook by being less clear about the specification
> thereof would make things any easier overall.

Wow, that sentence was complex to parse for a non-native English
speaker :-D

</Daniel>



David Cramer

Mar 10, 2013, 1:57:19 PM
to epub-ng
My original motivation was to ask, "how much simpler could EPUB be?" Books, even novels, are surprisingly complicated, and any attempt to represent the broad range of books is going to have some inherent complexity. But it seemed to me that, if you weren't worried about backward compatibility and history, then something nearly as functional as EPUB3 could exist, but be much simpler. And I think we've made tremendous progress here. Even though we haven't really settled a lot of big questions, just think what we've already eliminated:

mimetype, and the complex zip process that results in an uncompressed mimetype
META-INF
container.xml
most (if not all) namespaces
the package file (replaced by HTML)
the manifest
markup that's not in the HTML family at all.
triggers
fallbacks
CFI

That's quite a lot! 
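To make the first item on that list concrete: EPUB's container rules want a `mimetype` entry that comes first in the archive and is stored uncompressed, which many zip tools won't do by default. The Python sketch below contrasts that with "an ordinary zip". The function names are made up for the example, and the first function shows only the ordering and compression constraint; it does not produce a valid EPUB (no META-INF/container.xml, no package file).

```python
import io
import zipfile


def write_epub_style(files: dict[str, str]) -> bytes:
    """Write a ZIP whose first entry is an uncompressed 'mimetype' file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry has to be written first and must be stored,
        # not deflated, so it gets an explicit per-entry compress_type.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        for name, text in files.items():
            archive.writestr(name, text)
    return buffer.getvalue()


def write_plain_zip(files: dict[str, str]) -> bytes:
    """Write an ordinary ZIP: no special first entry, default compression."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return buffer.getvalue()
```

The second function is what any "compress folder" command already does, which is the simplification being claimed.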

Personally, I haven't been thinking of .e0 as a limited subset of EPUB—I hope it could be used for textbooks, educational materials, monographs as well as what the Inkling guy called "ten dollar text files." .e0 would be the container, but the web technology in the container would be unconstrained. 

* * *

On Mar 10, 2013, at 11:37 AM, Bill McCoy wrote:

> If you and some others want to instead make a spec that takes another whack at the HPub/Zhook pinata, without having to answer annoying questions about use case scenarios or feature coverage vs. today's EPUB, or things like accessibility, go for it.

I don't know the history of HPub or Zhook, but I suspect they did not go through a process like this, and their goals weren't nearly as ambitious. So I don't see their history as especially relevant to us. 


> If that's the direction the group or a subset of it decides to go - not to have a thought experiment about how we might improve EPUB if we sacrificed backwards compatibility - but something else entirely, something as you say smaller than EPUB, I would in that case like to request that you start using some other name that does not include "EPUB".

Bill, since the mailing list is a free-form discussion of future directions for EPUB, are we OK with calling the list "EPUB NG" for now? I may change the name of the blog to e0 :)

* * *

On Mar 10, 2013, at 10:22 AM, Baldur Bjarnason wrote:

> Meaningful classes, microformat-style.
>
> I.e. that we define two optional classes (spine, toc) that people can use to designate separate spines and tocs and skip the metadata question completely, and leave the epub:type style vocabulary thing to EPUB3.
>
> These classes would be completely optional, and the default, in the absence of a spine class, would be to just use the first OL in the first NAV tag.


I quite like that idea. We have lots of EPUB3 files that say

<div class="body Chapter" epub:type="body chapter">

which seems a bit redundant. A lot of our class vocabulary is very similar to the epub:type vocabulary. We have been describing the components of books for hundreds of years :)


> I think the entire metadata thing is overkill and is an area that we shouldn't be venturing into at all.

We need a tiny bit of metadata somewhere, if only so the external metadata can find the e-book. Horrified as people were by my initial example 

<meta name="dcterms.identifier" content="9780316XXXXXX">

It's standard HTML, doesn't pose validation problems, gives us Dublin Core, and allows us both book-level metadata (on index.html) and document-level metadata (in html content docs). 
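As a rough illustration of how little machinery this needs, here is a Python sketch that pulls `dcterms.*` values out of `meta` elements with the standard library's HTML parser. The names `MetaCollector` and `read_metadata` are invented for the example, and the same function would apply equally to index.html (book level) or to a content document (document level).

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta name="dcterms.*" content="..."> pairs."""

    def __init__(self, prefix: str = "dcterms."):
        super().__init__()
        self.prefix = prefix
        self.metadata: dict[str, list[str]] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        # A property may legitimately repeat, so values are kept in a list.
        if name.lower().startswith(self.prefix) and content is not None:
            self.metadata.setdefault(name, []).append(content)


def read_metadata(html: str) -> dict[str, list[str]]:
    """Return the dcterms.* metadata found in one HTML document."""
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    return collector.metadata
```

Feeding it a `head` that contains the example element above would give back a dictionary with a single `dcterms.identifier` key.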


> In any case, what you and Dave do is your call. And I apologise again for dragging things into the quagmire.

I'm in favor of everything being on the table, and have learned a lot from the discussion. 

* * *

On Mar 10, 2013, at 9:30 AM, Hadrien Gardeur wrote:

> A written proposal would certainly help, but there are still a few things that haven't been discussed (especially metadata).
>
> There's a lot of value in having such discussions in the open, and although we may have long arguments about slightly more trivial things, we end up having a decision in the end (the discussion about ToC was a good example, with pros/cons about link@rel=next vs nav in HTML, but also discussions about reading order).

Agreed.

> I don't think that a private discussion between two people and then two organizations would be anywhere near as useful.

Agreed!

* * *

On Mar 10, 2013, at 9:08 AM, Daniel Glazman wrote:

> Guys, I have deep concerns about the informal discussions happening
> here. Granted, "informal" means anyone can send ideas, advocate for
> them w/o having to care about IPRs or spec writing yet.
>
> But when I see mentions of RDFa, DocBook, metadata vocabularies and the
> html validator, I think this informal forum goes nowhere, or to be more
> precise goes in a direction I am not comfortable with. I am not here
> for something that will be bigger than EPUB3 and as complex as EPUB3.

I think I'm the one guilty of mentioning DocBook, but only to make the point that I was uncomfortable with HTML that would need a special validator due to non-standard additions. 


> I am not an epub reader implementor, I am an editing environment
> implementor. Based on that experience, I can affirm EPUB3 is a hell
> of a spec to implement and deal with. So I dream of a
> dead-simple format, so close to the Web
> we use every day in our browsers that a web editor could trivially
> become an ebook editor, that an unzipped ebook could be trivially opened
> TOC INCLUDED in a web browser. I don't want namespaces everywhere, I
> don't want extra CSS that main desktop browsers don't support, I don't
> want complex standards that only 50 individuals on the planet can be
> called an expert of. I want to be able to collect 15 articles from the
> Web and trivially form an ebook from that. I don't want to need third-
> party software for that if you exclude ZIP. I can accept creating a
> trivial TOC in html linking to my articles. Not more.

As I mentioned above, I think we've fought off namespaces, and there's been no mention of non-standard CSS. At the core, we're looking at index.html with a toc, an ordinary zip, and everything else is standard HTML. Whether we need to optionally extend that is the big question.


> At this point, I think Dave and I should just start editing an E0
> Editor's Draft and submit it to a joint group between W3C and IDPF
> for discussion and potentially standardization. I don't think we will
> break the endless loops I see starting in this forum without a written
> proposal on our radar.

I can see value in drafting a spec, as a way to clarify our thinking. But I hope everyone will express their opinions—I've been rather heartened by much of the discussion here, as attempts to add complexity have been opposed, and there is a lot of knowledge and experience in this whole group. 

Dave


Bill McCoy

Mar 10, 2013, 2:11:45 PM
to David Cramer, epub-ng
Hi Dave,

Re: what's eliminated and "markup that's not in the HTML family" - does e0 eliminate SVG and MathML? What about SMIL-based Media Overlays? I guess it's not clear what you mean by "HTML family". If you mean W3C specifications, this of course includes XML itself.

Re:  .e0 not being  "a limited subset of EPUB—I hope it could be used for textbooks, educational materials, monographs as well as what the Inkling guy called 'ten dollar text files.'". EPUB 3 of course is used for  all of this, and is unconstrained regarding the web technology in the container with the proviso that the HTML be made into more easily parsable XHTML (e.g. 100% of HTML5 is supported), so if you are arguing that we should be continuing to operate on the thought experiment of how can we improve EPUB w/out backwards compatibility, I'm with you (and in that case I have no objections personally about what we call it for the time being).

If you are instead implying that e0 would be somehow *more* capable than EPUB 3, well I haven't seen anything concrete in the discussion so far to back that up. What I've heard is that it might be somewhat easier to hand author, and possibly easier to directly deploy from a website, at the expense of being a lot more difficult to reliably manipulate (remix, search, etc.), to validate or ingest into a distribution repository, possibly having a more poorly defined security model, etc. (i.e. requiring support for *both* HTML "tag soup" and XHTML is "simpler" only for the hand-coding author and for the use case of direct website deployment, but is actually more complicated for every other use case). But none of this has anything to do with basic capabilities regarding web technology. And if e0 ended up not supporting accessibility requirements, it would almost certainly not end up more useful for textbooks and educational materials.

--Bill

Daniel Glazman

Mar 10, 2013, 2:16:59 PM
to epu...@googlegroups.com
On 10/03/13 18:57, David Cramer wrote:

>> If that's the direction the group or a subset of it decides to go - not
>> to have a thought experiment about how we might improve EPUB if we
>> sacrificed backwards compatibility - but something else entirely,
>> something as you say smaller than EPUB, I would in that case like to
>> request that you start using some other name that does not include "EPUB".
>
> Bill, since the mailing list is a free-form discussion of future
> directions for EPUB, are we OK with calling the list "EPUB NG" for now?
> I may change the name of the blog to e0 :)

I agree with Bill. EPUB being a spec of IDPF, the future of EPUB is
discussed there. E0 looks fine to me because I can read it as
"Ebook Zero". Dave, I agree with Bill here, we should get rid
of "epub" in this informal place's name.

</Daniel>

David Cramer

Mar 10, 2013, 2:31:59 PM
to Bill McCoy, epub-ng
On Mar 10, 2013, at 2:11 PM, Bill McCoy wrote:

> Hi Dave,
>
> Re: what's eliminated and "markup that's not in the HTML family" - does e0 eliminate SVG and MathML? What about SMIL-based Media Overlays? I guess it's not clear what you mean by "HTML family". If you mean W3C specifications, this of course includes XML itself.

Sorry for not being more clear! I do include SVG and MathML. I do not mean all W3C specs! SMIL is a tough one, and I don't know what to do about that. 


> Re: .e0 not being "a limited subset of EPUB - I hope it could be used for textbooks, educational materials, monographs as well as what the Inkling guy called 'ten dollar text files.'". EPUB 3 of course is used for all of this, and is unconstrained regarding the web technology in the container with the proviso that the HTML be made into more easily parsable XHTML (e.g. 100% of HTML5 is supported), so if you are arguing that we should be continuing to operate on the thought experiment of how can we improve EPUB w/out backwards compatibility, I'm with you (and in that case I have no objections personally about what we call it for the time being).

EPUB3 is constrained by the CSS profile, and I keep forgetting what's included and what isn't :)


> If you are instead implying that e0 would be somehow *more* capable than EPUB 3, well I haven't seen anything concrete in the discussion so far to back that up. What I've heard is that it might be somewhat easier to hand author, and possibly easier to directly deploy from a website, at the expense of being a lot more difficult to reliably manipulate (remix, search, etc.), to validate or ingest into a distribution repository, possibly having a more poorly defined security model, etc. (i.e. requiring support for *both* HTML "tag soup" and XHTML is "simpler" only for the hand-coding author and for the use case of direct website deployment, but is actually more complicated for every other use case). But none of this has anything to do with basic capabilities regarding web technology. And if e0 ended up not supporting accessibility requirements, it would almost certainly not end up more useful for textbooks and educational materials.

I didn't mean to imply that e0 would be more capable. The only example I know of now would be that it would be easier to have both container-level metadata and document-level metadata.

I'm not sure how a garden-variety HTML5 file in an e0 container would be harder to search than a garden-variety HTML5 file in an EPUB3 container, but then there is much I do not know about implementing reading systems!

Thanks,

Dave

Bill McCoy

Mar 10, 2013, 6:04:48 PM
to David Cramer, epub-ng
Hi Dave,

Re: "I do include SVG and MathML. I do not mean all W3C specs! SMIL is a tough one, and I don't know what to do about that." Well, it would be nice to know what you do mean then, because SVG and MathML are more in the XML family than the HTML family (and BTW note that EPUB 3.0 Media Overlays does not normatively reference all of SMIL but only defines a vocabulary with a constrained subset thereof).

Re: "EPUB3 is constrained by the CSS profile". That is completely incorrect! EPUB 3 requires conformant Reading Systems to implement *at least* the defined EPUB 3 CSS profile but that profile is only a minimum bar to give some baseline guidance for content authors (as nothing in any W3C recs actually says what collection of CSS modules must/should be implemented by a browser, i.e. no such minimum bar is defined elsewhere). It is explicitly stated that a Reading System may implement more than this profile ( http://idpf.org/epub/30/spec/epub30-contentdocs.html#sec-css-rs-conf ). So content developers for the EPUB 3 ecosystem have a pretty full-fledged profile as a minimum bar with EPUB 3.0 and in practice can expect full CSS as supported by browser engines. This is a very big change from the EPUB 2 world though, and I know content authors like you have been used to wrestling with pretty paltry EPUB 2 CSS support, thanks to non-browser-engine-based implementations. So I accept it may take a little while until the new reality sinks in. Also there are some vendors that have full CSS in their reading systems but impose limitations on or do machinations with CSS as part of ingestion of content into their retail environment. But this is nothing to do with EPUB 3's specs and could just as easily happen with a hypothetical e0 format.

Re: "I'm not sure how a garden-variety HTML5 file in an e0 container would be harder to search than a garden-variety HTML5 file in an EPUB3 container" - I was talking about XHTML in EPUB 3 container. Given the relaxed parsing rules of ("tag soup") HTML5 it's pretty complex to handle in the general case (that is unless you think http://www.w3.org/html/wg/drafts/html/master/syntax.html#parsing is simple!). Of course if you want to search HTML5 and don't care if it's reliable, i.e. it's OK if your search breaks for some HTML5 files that browsers handle fine, then it's pretty easy (although it still requires writing some custom code vs. using XML libraries that are built in to all platforms). So I guess I would revise my claim as - again assuming you'll be given the job on a random development environment - it's going to be somewhat harder to search HTML5 than XHTML5 and a *lot* harder to do so robustly.
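The difference can be seen with nothing but Python's standard library, in a small sketch under stated assumptions: the function names are invented for the example, and `html.parser` is a lenient tokenizer, not an implementation of the HTML5 parsing algorithm linked above, so it stands in for the "easy but not necessarily reliable" route rather than the robust one.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Accumulate character data, ignoring how well the tags nest."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def text_via_xml(markup: str):
    """Extract text with a strict XML parser; None if not well-formed."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return None
    return "".join(root.itertext())


def text_via_html(markup: str) -> str:
    """Extract text with the lenient stdlib HTML tokenizer."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


def contains(markup: str, needle: str) -> bool:
    """Search the text of a document, preferring the strict XML route."""
    text = text_via_xml(markup)
    if text is None:
        # Tag soup: the built-in XML library cannot help, so fall back.
        text = text_via_html(markup)
    return needle in text
```

Well-formed XHTML such as `<p>Hello <br/>world</p>` goes through the XML route; the equally browser-friendly `<p>Hello <br>world` is rejected by it and only the fallback finds the text.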

--Bill 