GITenberg Project
Project Gutenberg's epubmaker & PG RST
Tom Morris
Mar 25, 2015, 9:32:33 PM
to gitenber...@googlegroups.com
I haven't seen this mentioned anywhere. It's apparently what PG uses to generate their epub formats from either HTML or RST (mostly HTML).
- epubmaker - http://pydoc.net/Python/epubmaker/0.3.20/ and http://www.gutenberg.org/tools/
- epubmaker online conversion service - http://epubmaker.pglaf.org/
- PG RST spec / test doc - http://pgrst.pglaf.org/publish/181/181-h.html
Related PGDP docs:
- HTML best practices doc - http://www.pgdp.org/~jana/best-practices/pages/best-practices/
- The Proofreader's Guide to EPUB - http://www.pgdp.net/wiki/The_Proofreader%27s_Guide_to_EPUB
Tom
Eric Hellman
Mar 26, 2015, 9:07:35 AM
to gitenber...@googlegroups.com
Tom,
this is very useful, thanks.
We're experimenting with running epubmaker with Travis-CI. Not quite there yet.
Somehow I'd missed http://upload.pglaf.org/
That fills in things I'd been wondering about.
Eric
Tom Morris
Mar 26, 2015, 4:11:30 PM
to gitenber...@googlegroups.com
On Thu, Mar 26, 2015 at 9:07 AM, Eric Hellman <er...@hellman.net> wrote:
> Somehow I'd missed http://upload.pglaf.org/
> That fills in things I'd been wondering about.
The sequence, as I understand it, is:
1. Get a copyright clearance line - http://copy.pglaf.org/
2. Generate a text (pgdp.net accounts for >50% of them), then validate it:
- run gutcheck
- run HTML validator - http://validator.pglaf.org/ (clone of W3C service)
- run CSS validator - http://jigsaw.w3.org/css-validator/
- run link validator - http://validator.pglaf.org/checklink
- do a dry run of the epub converter - http://epubmaker.pglaf.org/index.php
3. Upload the text for processing & publication - http://upload.pglaf.org/
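For anyone who wants to track this in a script, the sequence above can be sketched as a small checklist in Python. This is only an illustration under stated assumptions: `Step`, `remaining` and `ready_to_upload` are hypothetical names, the step labels are mine, and nothing here contacts the PG services. The URLs are the ones listed above.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Step:
    """One validation check; url is None when no URL was listed for it."""
    name: str
    url: Optional[str] = None


# URLs for steps 1 and 3 of the sequence.
CLEARANCE_URL = "http://copy.pglaf.org/"
UPLOAD_URL = "http://upload.pglaf.org/"

# Checks from step 2, in the order given above (labels are illustrative).
VALIDATION: List[Step] = [
    Step("gutcheck"),
    Step("HTML validator", "http://validator.pglaf.org/"),
    Step("CSS validator", "http://jigsaw.w3.org/css-validator/"),
    Step("link validator", "http://validator.pglaf.org/checklink"),
    Step("epub converter dry run", "http://epubmaker.pglaf.org/index.php"),
]


def remaining(completed: Set[str]) -> List[Step]:
    """Return the validation checks not yet marked complete, in order."""
    return [step for step in VALIDATION if step.name not in completed]


def ready_to_upload(has_clearance: bool, completed: Set[str]) -> bool:
    """Upload only with a clearance line and every check completed."""
    return has_clearance and not remaining(completed)


if __name__ == "__main__":
    done = {"gutcheck", "HTML validator"}
    for step in remaining(done):
        print(f"TODO: {step.name} {step.url or ''}".rstrip())
    print("ready to upload:", ready_to_upload(True, done))
```

The checks are kept in a list rather than a set so that the order from the post is preserved when printing what is left to do.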
At Distributed Proofreaders, the sequence is:
- scan book pages, either privately or by acquiring images from Internet Archive, Canadiana Online or other archives
- OCR (mostly using ABBYY FineReader)
- three rounds of proofreading
- two rounds of formatting
- post-processing of the resulting UTF-8 text to create an HTML version (usually)
All proofing and formatting is done to the project manager's specs, largely based on global standards used by DP.
Post-processors have a fair amount of latitude to generate the HTML as they see fit, as long as it falls within what can be handled by PG's epubmaker.
Tom
p.s. Looks like PG toyed with something similar to GITenberg - http://www.gutenberg.org/wiki/Mercurial_Repository_How-To - perhaps as a result of previous discussions about crowdsourcing: http://search.gmane.org/?query=crowdsourcing&group=gmane.culture.literature.e-books.gutenberg.volunteers