Project Gutenberg's epubmaker & PG RST


Tom Morris

Mar 25, 2015, 9:32:33 PM
to gitenber...@googlegroups.com
I haven't seen this mentioned anywhere.  It's apparently what PG uses to generate their epub formats from either HTML or RST (mostly HTML).

- epubmaker online conversion service - http://epubmaker.pglaf.org/

Related PGDP docs

Tom


Eric Hellman

Mar 26, 2015, 9:07:35 AM
to gitenber...@googlegroups.com
Tom,

 this is very useful, thanks. 

We're experimenting with running epubmaker with Travis-CI. Not quite there yet.

Somehow I'd missed http://upload.pglaf.org/   That fills in things I'd been wondering about.

Eric

Tom Morris

Mar 26, 2015, 4:11:30 PM
to gitenber...@googlegroups.com
On Thu, Mar 26, 2015 at 9:07 AM, Eric Hellman <er...@hellman.net> wrote:

> Somehow I'd missed http://upload.pglaf.org/  That fills in things I'd been wondering about.

The sequence, as I understand it, is:

1. Get a copyright clearance line - http://copy.pglaf.org/

2. Generate a text (pgdp.net is the source for >50%), then validate it:
  - run gutcheck
  - run HTML validator - http://validator.pglaf.org/ (clone of W3C service)
  - run CSS validator - http://jigsaw.w3.org/css-validator/
  - run link validator - http://validator.pglaf.org/checklink
  - run dry run of epub converter - http://epubmaker.pglaf.org/index.php

3. Upload the text for processing & publication - http://upload.pglaf.org/
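
As a rough illustration only, the validation part of step 2 could be tracked as an ordered checklist before moving on to the upload in step 3. This is a sketch, not anything PG provides: the `Check` class and the helper functions are hypothetical names made up for the example, the URLs are the ones listed above, and gutcheck is assumed to be run locally since no URL is given for it. Nothing here calls the services themselves.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Check:
    """One pre-upload validation step (hypothetical helper type)."""
    name: str
    url: Optional[str] = None  # None: assumed to be a locally run tool
    done: bool = False


def pre_upload_checks() -> List[Check]:
    """Return the step 2 checks in the order they are listed above."""
    return [
        Check("gutcheck"),
        Check("HTML validator", "http://validator.pglaf.org/"),
        Check("CSS validator", "http://jigsaw.w3.org/css-validator/"),
        Check("link validator", "http://validator.pglaf.org/checklink"),
        Check("epubmaker dry run", "http://epubmaker.pglaf.org/index.php"),
    ]


def next_pending(checks: List[Check]) -> Optional[Check]:
    """Return the first check not yet marked done, or None if all are done."""
    for check in checks:
        if not check.done:
            return check
    return None


def ready_to_upload(checks: List[Check]) -> bool:
    """True only when every check has been marked done."""
    return all(check.done for check in checks)


if __name__ == "__main__":
    checks = pre_upload_checks()
    checks[0].done = True  # e.g. after a clean gutcheck run
    pending = next_pending(checks)
    if pending is not None:
        print("Next:", pending.name, pending.url or "(local tool)")
    print("Ready for http://upload.pglaf.org/:", ready_to_upload(checks))
```

Marking each check done by hand keeps the sketch independent of how any of the validators actually report results.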

At DistributedProofreaders, the sequence is:
- scan book pages, either privately or by acquiring images from Internet Archive, Canadiana Online or other archives
- OCR (mostly using ABBYY FineReader)
- three rounds of proofreading
- two rounds of formatting
- post processing of resulting UTF-8 text to create an HTML version (usually)

All proofing and formatting is done to the project manager's specs, largely based on global standards used by DP.

Post processors have a fair amount of latitude to generate the HTML as they see fit, as long as it falls within what can be handled by PG's epubmaker.

Tom

p.s. Looks like PG toyed with a similar thing to Gitenberg - http://www.gutenberg.org/wiki/Mercurial_Repository_How-To -- perhaps a result of previous discussions about crowdsourcing http://search.gmane.org/?query=crowdsourcing&group=gmane.culture.literature.e-books.gutenberg.volunteers
