IDs in zhook

Francisco Treacy

Aug 31, 2010, 5:56:49 PM
to zhook...@googlegroups.com
Hi all,

How do I declare a book's ID in a zhook? I presume some meta property,
but I can't seem to find this in the spec.

Thanks,
Francisco

Joseph Pearson

Aug 31, 2010, 10:23:26 PM
to zhook...@googlegroups.com
Short answer:

<meta name="bookid" content="xyz">

Long answer:

The unique book identifier is an EPUB concept that has no analogue in
Zhook. I'm not really a fan of the idea — it seems like half a
solution to a not particularly significant problem. For our project I
just use file hashes to identify identical ebook versions, and
internal database identifiers to link ebook versions to a
"publication".
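
As a rough illustration of the file-hash approach described above, here is a minimal Python sketch. The thread does not say which hash function is used, so SHA-256 is an assumption, and the function names `file_digest` and `same_version` are hypothetical.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`.

    SHA-256 is an assumption; the thread only says "file hashes".
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large ebook files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_version(path_a, path_b):
    """Treat two ebook files as the same version if their digests match."""
    return file_digest(path_a) == file_digest(path_b)
```

Two byte-identical files yield the same digest, so the digest can serve as a key for spotting duplicate uploads. It says nothing about whether two different files belong to the same publication, which is why a separate internal identifier is still needed for that link.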

So it really comes down to how your Zhook->EPUB conversion tool
obtains a unique id. Peregrin uses the "bookid" metadatum and
generates a 12-char random string if it's not found. Peregrin's
predecessor, chook (which is still active on publisher.ochook.org for
now), used the 'identifier' metadatum, else the 'isbn', else a URL-ish
construction based on the internal identifier for the upload.
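
The two fallback chains described above could be sketched roughly as follows. This is not Peregrin's or chook's actual code. The function names, the random-string alphabet, and the `base_url` value are all assumptions made for illustration. Only the order of the fallbacks comes from the description.

```python
import secrets
import string

# Assumed alphabet; the thread only says "12-char random string".
ALPHABET = string.ascii_lowercase + string.digits


def random_id(length=12):
    """Return a random string of `length` characters from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def peregrin_style_id(meta):
    """Use the 'bookid' metadatum, else a 12-character random string."""
    return meta.get("bookid") or random_id(12)


def chook_style_id(meta, upload_id, base_url="http://example.org/books/"):
    """Use 'identifier', else 'isbn', else a URL-like value for the upload.

    `base_url` is a placeholder; the real construction is not given in the thread.
    """
    return meta.get("identifier") or meta.get("isbn") or f"{base_url}{upload_id}"
```

Here `meta` is assumed to be a plain dict built from the book's `<meta name="..." content="...">` tags. Falling back to ISBN in the Peregrin-style function would mean adding one more `or meta.get("isbn")` before the random string.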

I'm happy to take patches for Peregrin's behaviour — probably it
should fall back to ISBN like chook did, but I haven't given it enough
thought.

- J

--
Joseph Pearson | software inventor | inventivelabs.com.au | +61394163198

Francisco Treacy

Sep 1, 2010, 4:45:49 AM
to zhook...@googlegroups.com
On Wed, Sep 1, 2010 at 4:23 AM, Joseph Pearson
<jos...@inventivelabs.com.au> wrote:
> Short answer:
>
>    <meta name="bookid" content="xyz">

I see. I should have looked into Peregrin... I blame my lack of sleep.

> Long answer:
>
> The unique book identifier is an EPUB concept that has no analogue in
> Zhook. I'm not really a fan of the idea — it seems like half a
> solution to a not particularly significant problem. For our project I
> just use file hashes to identify identical ebook versions, and
> internal database identifiers to link ebook versions to a
> "publication".

We agree it's not a particularly significant problem. We also use file
hashes. An ID field could come in handy, however, in the case of
having multiple versions/publications if you need to identify them as
siblings (don't know how you go about that with your internal IDs).

> So it really comes down to how your Zhook->EPUB conversion tool
> obtains a unique id. Peregrin uses the "bookid" metadatum and
> generates a 12-char random string if it's not found. Peregrin's
> predecessor, chook (which is still active on publisher.ochook.org for
> now), used the 'identifier' metadatum, else the 'isbn', else a URL-ish
> construction based on the internal identifier for the upload.
>
> I'm happy to take patches for Peregrin's behaviour — probably it
> should fall back to ISBN like chook did, but I haven't given it enough
> thought.

I might fix that. Not a priority for now.

Francisco

Joseph Pearson

Sep 1, 2010, 6:20:12 AM
to zhook...@googlegroups.com

On 01/09/2010, at 6:45 PM, Francisco Treacy <fran...@widescript.com> wrote:

>> internal database identifiers to link ebook versions to a
>> "publication".
>
> We agree it's not a particularly significant problem. We also use file
> hashes. An ID field could come in handy, however, in the case of
> having multiple versions/publications if you need to identify them as
> siblings (don't know how you go about that with your internal IDs).

Yep, I know what you mean. More or less we leave it up to the user to update a publication with a different version of the book. But that may just be an artefact of our own design/architecture.

Happy to consider a small, obvious set of conventional metadata, among which we'd find a unique id. This kinda quickly gets caught up in weighty Dublin Core distractions, I think. Still, some solid ground would be useful to book designers. Is it reasonable to have a simple list like Title/Creator/Identifier/...?

- J

Francisco Treacy

Sep 1, 2010, 9:22:29 AM
to zhook...@googlegroups.com
On Wed, Sep 1, 2010 at 12:20 PM, Joseph Pearson
<jos...@inventivelabs.com.au> wrote:
> Happy to consider a small, obvious set of conventional metadata, among which
> we'd find a unique id. This kinda quickly gets caught up in weighty Dublin
> Core distractions, I think. Still, some solid ground would be useful to book
> designers. Is it reasonable to have a simple list like
> Title/Creator/Identifier/...?

It is, but it should remain as simple as possible. I wouldn't require
any of those meta fields, but would suggest that authors include a title,
creator and identifier:

<meta name="title" content="The Elements of Style">
<meta name="creator" content="William Strunk Jr.">
<meta name="id" content="urn:uuid:6f82990c-9394-11df-920d-001cc0a62c0b">
<meta name="description" content="The Elements of Style is an
American English writing style guide. It is one of the most
influential and best-known prescriptive treatments of English grammar
and usage in the United States. It originally detailed eight
elementary rules of usage, ten elementary principles of composition,
and &quot;a few matters of form&quot; as well as a list of commonly
misused words and expressions. Updated editions of the paperback book
are often required reading for American high school and college
composition classes.">

among perhaps a handful of metadata that are common and make sense for
*all* publications. In the end, it's just a recommendation. You can
add whatever fields you like to your meta tags.

If authors don't provide all or part of them, it's up to the reading
system (RS) to handle that case.
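
To make that concrete, here is a minimal Python sketch of how a reading system might collect these meta tags and notice which recommended ones are absent. It uses only the standard library's `html.parser`. The names `MetaCollector`, `read_meta`, `missing_recommended`, and the `RECOMMENDED` tuple are hypothetical and reflect the suggestion in this message rather than anything in the Zhook spec.

```python
from html.parser import HTMLParser

# The fields suggested in this message; a recommendation, not a requirement.
RECOMMENDED = ("title", "creator", "id")


class MetaCollector(HTMLParser):
    """Collect <meta name="..." content="..."> pairs into a dict."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name] = content


def read_meta(html):
    """Return a dict of meta name -> content found in `html`."""
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    return collector.meta


def missing_recommended(meta):
    """List the recommended fields that are absent or empty."""
    return [name for name in RECOMMENDED if not meta.get(name)]
```

A reading system could call `missing_recommended(read_meta(source))` and decide for itself what to do with the result, for example generating an identifier when `id` is missing. Any extra meta fields an author adds are kept in the dict and otherwise ignored.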

Francisco
