This is very exciting news!
Thanks for the comprehensive update.
On 31/10/18 1:11 am, er...@hellman.net wrote:
>
> A Milestone for GITenberg
> <https://go-to-hellman.blogspot.com/2018/10/a-milestone-for-gitenberg.html>
>
>
> We've reached a big milestone for the GITenberg Project, which comes
> after a lot of work over 6 years by several groups of people. It's now
> ready to use!
>
> GITenberg is a prototype that explores how Project Gutenberg
> <https://www.gutenberg.org/> might work if all the Gutenberg texts were
> on Github <https://github.com/>, so that tools like version control,
> continuous integration, and pull-request workflow could be employed. We
> hope that Project Gutenberg can take advantage of what we've learned;
> work in that direction has begun but needs resources and volunteers. Go
> check it out <https://www.gitenberg.org/>!
>
> It's hard to believe, but GITenberg <https://github.com/GITenberg>
> started 6 years ago when Seth Woodworth started making Github repos for
> Gutenberg texts. I joined the project two years later when I started
> doing the same and discovered that Seth was 43,000 repos ahead of me.
> The project got a big boost when the Knight Foundation
> <https://knightfoundation.org/> awarded us a Prototype Fund grant
> <https://americanlibrariesmagazine.org/2015/06/18/empowering-libraries-to-innovate/>
> to "explore the applicability of open-source methodologies to the
> maintenance of the cultural heritage" that is the Project Gutenberg
> collection. But there were big chunks of effort left to finish the work
> when that grant ended. Last year, six computer-science seniors from
> Stevens Institute of Technology <https://www.stevens.edu/> took up the
> challenge and brought the project within sight of a major milestone (if
> not the finishing-line). There remained only the reprocessing of 58,000
> ebooks (with more being created every day!). As of last week, *we've
> done that! <https://www.gitenberg.org/>* Whew.
>
> So here's what's been done:
>
> * Almost 57,000 texts from Project Gutenberg have been loaded into
> Github repositories.
> * EPUB, PDF, and Kindle Ebooks have been rebuilt and added to releases
> for all but about 100 of these.
> * Github webhooks trigger dockerized <https://www.docker.com/> ebook
> building machines running on AWS Elastic Beanstalk
> <https://aws.amazon.com/elasticbeanstalk/> every time a git repo is
> tagged.
> * Toolchains for asciidoc <http://asciidoc.org/>, HTML and plain text
> source files are running on the ebook builders.
> * A website at https://www.gitenberg.org/ uses the webhooks to index
> and link to all of the ebooks.
> * www.gitenberg.org <https://www.gitenberg.org/> presents links to
> Github, Project Gutenberg, Librivox <https://librivox.org/>, and
> Standard Ebooks <https://standardebooks.org/>.
> * Cover images are supplied for every ebook.
> * Human-readable metadata files are available for every ebook.
> * Syndication feeds for these books are made available in ONIX
> <https://www.editeur.org/11/Books/>, MARC <https://www.loc.gov/marc/>
> and OPDS <http://opds-spec.org/> via Unglue.it
> <https://unglue.it/api/help>.
>
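To make the tag-triggered build step concrete for myself, here is a minimal sketch of the check a webhook receiver might run before queuing a build. It assumes the receiver listens for GitHub's `create` event, whose payload carries `ref` and `ref_type`; I don't know which event the GITenberg builders actually subscribe to. The function name `should_build` and the sample repo name are made up for illustration.

```python
import json

def should_build(event: str, payload: dict):
    """Return (repo_full_name, tag) if this delivery announces a new tag, else None.

    `event` is the value of the X-GitHub-Event header sent with the delivery.
    """
    # Only a `create` event with ref_type "tag" counts as "the repo was tagged".
    if event != "create" or payload.get("ref_type") != "tag":
        return None
    return payload["repository"]["full_name"], payload["ref"]

# Hypothetical payload, trimmed to the fields the check reads.
sample = json.loads(
    '{"ref": "0.2.0", "ref_type": "tag",'
    ' "repository": {"full_name": "GITenberg/Example-Book_1"}}'
)

print(should_build("create", sample))
print(should_build("push", sample))
```

A real receiver would also verify the webhook signature before trusting the payload; that part is left out here.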
> Everything in this project is built in the hope that the bits can be
> incorporated into Project Gutenberg wherever appropriate. In January
> 2019 <https://law.duke.edu/cspd/publicdomainday/>, the US public domain
> will resume the addition of new books, so it's more important than ever
> that we strengthen the infrastructure that supports it.
>
> Some details:
>
> * All of the software <https://github.com/GITenberg-dev> that's been
> used is open source and content is openly licensed.
> * PG's epubmaker software
> <https://github.com/gitenberg-dev/pg-epubmaker> has been
> significantly strengthened and improved.
> * About 200 PG ebooks have had fatal formatting errors remediated to
> allow for automated ebook file production.
> * 1,363 PG ebooks
> <https://github.com/gitenberg-dev/gitberg/blob/master/gitenberg/data/missing.tsv>
> were omitted from this work due to licensing or because they aren't
> really books.
> * PG's RDF metadata files were converted to human-readable YAML and
> enhanced with data from New York Public Library and from Wikipedia.
> * Github API throttling limits the build/release rate to about 600
> ebooks/hour/login. A full build takes about 4 full days with one
> github login.
>
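The build-time figure checks out against the other numbers in the post. A quick back-of-the-envelope sketch, using the roughly 57,000 texts and 600 ebooks/hour/login quoted above; the function name is my own, and the multi-login case assumes the throttle scales linearly per login, which the post implies but doesn't state outright.

```python
def full_build_days(ebooks: int, per_hour_per_login: int, logins: int = 1) -> float:
    """Days needed to build `ebooks` at the throttled rate."""
    hours = ebooks / (per_hour_per_login * logins)
    return hours / 24

# One login: 57,000 / 600 = 95 hours, just under 4 days.
print(round(full_build_days(57_000, 600), 1))

# Four logins, assuming the limit really is per login.
print(round(full_build_days(57_000, 600, logins=4), 1))
```

So a single login needs about 95 hours, which matches the "about 4 full days" quoted above.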
> Acknowledgements:
>
> * Seth Woodworth. In retrospect, the core idea was obvious, audacious,
> and crazy. Like all great ideas.
> * Github tech support. Always responsive.
> * The O'Reilly HTMLBook <https://github.com/oreillymedia/HTMLBook>
> team. The asciidoc toolchain is based on their work.
> * Plympton. Many asciidoc versions were contributed to GITenberg as
> part of the "Recovering the Classics
> <http://recoveringtheclassics.com/>" project. Thanks to Jenny 8.
> Lee, Michelle Cheng, Max Pevner and Nessie Fox.
> * Albert Carter and Paul Moss contributed to early versions of the
> GITenberg website.
> * The Knight Foundation provided funding for GITenberg at a key
> juncture in the project's development through its Prototype Fund. The
> Knight Foundation supports public-benefiting innovation in so many
> ways even beyond the funding it provides, and we thank them with all
> our hearts.
> * Travis-CI <https://travis-ci.org/>. The first version of automated
> ebook building took advantage of Travis-CI. Thanks!
> * Raymond Yee got the automated ebook building to actually work.
> * New York Public Library <https://nypl.org/> contributed
> descriptions, rights info, and generative covers
> <https://www.nypl.org/blog/2014/09/03/generative-ebook-covers>. They
> also sponsored hackathons that significantly advanced the
> environment for public domain books. Special thanks to Leonard
> Richardson, Mauricio Giraldo and Jens Troeger (Bookalope).
> * My Board at the Free Ebook Foundation
> <https://ebookfoundation.org/>: Seth, Vicky Reich, Rupert Gatti,
> Todd Carpenter, Michael Wolfe and Karen Liu. Yes, we're overdue for
> a board meeting...
> * The Stevens GITenberg team: Marc Gotliboym, Nicholas Tang-Mifsud,
> Brian Silverman, Brandon Rothweiler, Meng Qiu, and Ankur Ramesh.
> They redesigned the gitenberg.org website, added search, added
> automatic metadata updates, and built the dockerized elastic
> beanstalk ebook-builder and queuing system
> <https://github.com/gitenberg-dev/gitberg-autoupdate>. This work was
> done as part of their two-semester capstone (project) course. The
> course is taught by Prof. David Klappholz, who managed a total of 23
> student projects last academic year. Students in the course design
> and develop software for established companies, early stage
> startups, nonprofits, gov't agencies, etc., etc. Take a look at
> detailed information
> <https://sites.google.com/view/sitseniordesign/home?authuser=0> about
> software that has been developed over the past 6-7 years and details
> of how the course works.
> * Last, but certainly not least, Greg Newby (Project Gutenberg) for
> consistent encouragement and tolerance of our nit-discovery, Juliet
> Sutherland (Distributed Proofreaders <https://www.pgdp.net/c/>) for
> her invaluable insights into how PG ebooks get made, and to the
> countless volunteers at both organizations who collectively have
> made possible the preservation and reuse of our public domain.
>
> I'm sure I've omitted an important acknowledgement or two - please let
> me know so I can rectify the omission.
>
> So what's next? As I mentioned, we've taken some baby steps towards
> applying version control <https://github.com/gutenbergbooks> to Project
> Gutenberg. But Project Gutenberg is a complex organism, and implementing
> profound changes will require broad consensus-building and resource
> gathering (both money and talent). Project Gutenberg
> <https://www.gutenberg.org/> and the Free Ebook Foundation
> <https://ebookfoundation.org/> are very lean non-profit organizations
> dependent on volunteers and small donations. What's next is really up to
> you!
>