Torrent of all books available


A Hutch

Jan 6, 2021, 3:14:16 PM
to Standard Ebooks
Hello all - there are many large communities of data archivists across the internet, and while Gutenberg torrents exist, there does not seem to be a torrent for the Standard Ebooks collection.

Scraping the website would be possible, but poor practice - we don't want to stress the servers needlessly for each person who wants to archive the whole collection. Would it be possible/encouraged to put together the collection in torrent form?

Thank you

Alex Cabal

Jan 6, 2021, 3:17:03 PM
to standar...@googlegroups.com
That's certainly something you could set up. Check out our OPDS feed:
standardebooks.org/opds

Note that we frequently update ebooks, even older ones, so the best
torrent would be one that frequently checks the feed for updated ebooks.

A Hutch

Jan 6, 2021, 4:37:37 PM
to Standard Ebooks
I've thrown together a basic Python script to pull all the available ePubs from that OPDS feed. I'll be watching the changes made to the feed over the next few weeks to get an idea of how to handle updates.

As for having a constantly updated torrent, I'm looking into BitTorrent's BEP 39 (Updating Torrents Via Feed URL) and BEP 46 (Updating Torrents Via DHT Mutable Items). Of course, automatic updates piping directly into the swarm raise security concerns, so I'll have to examine what safeguards I can put in place.
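
For readers who want to try the same approach, here is a minimal sketch of how an OPDS feed could be scanned for ePub links and changed entries. It is not the script mentioned above, which was not posted. It assumes the feed is an Atom document whose entries carry an `<updated>` timestamp and an acquisition link of type `application/epub+zip`, as the OPDS specification describes; the sample feed, the `example.org` URLs, and the function names `epub_entries` and `needs_download` are all made up for illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
# OPDS acquisition relations all start with this prefix
# (for example ".../acquisition/open-access").
ACQUISITION_REL = "http://opds-spec.org/acquisition"
EPUB_TYPE = "application/epub+zip"

# Hypothetical sample feed; a real feed would be fetched over HTTP.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>urn:example:book-one</id>
    <title>Book One</title>
    <updated>2021-01-05T12:00:00Z</updated>
    <link rel="http://opds-spec.org/acquisition/open-access"
          type="application/epub+zip"
          href="https://example.org/book-one.epub"/>
  </entry>
  <entry>
    <id>urn:example:book-two</id>
    <title>Book Two</title>
    <updated>2021-01-06T08:30:00Z</updated>
    <link rel="http://opds-spec.org/acquisition/open-access"
          type="application/x-mobipocket-ebook"
          href="https://example.org/book-two.azw3"/>
    <link rel="http://opds-spec.org/acquisition/open-access"
          type="application/epub+zip"
          href="https://example.org/book-two.epub"/>
  </entry>
</feed>"""


def epub_entries(feed_xml):
    """Return (entry_id, updated, epub_url) for each entry with an ePub link."""
    root = ET.fromstring(feed_xml)
    found = []
    for entry in root.iter(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id", default="")
        updated = entry.findtext(ATOM + "updated", default="")
        for link in entry.findall(ATOM + "link"):
            rel = link.get("rel", "")
            if rel.startswith(ACQUISITION_REL) and link.get("type") == EPUB_TYPE:
                found.append((entry_id, updated, link.get("href")))
                break  # one ePub link per entry is enough for this sketch
    return found


def needs_download(entries, seen):
    """Keep entries that are new or whose timestamp differs from the local record.

    `seen` maps entry id to the `updated` value stored at the last download.
    """
    return [e for e in entries if seen.get(e[0]) != e[1]]


if __name__ == "__main__":
    seen = {"urn:example:book-one": "2021-01-05T12:00:00Z"}
    for entry_id, updated, url in needs_download(epub_entries(SAMPLE_FEED), seen):
        print(entry_id, updated, url)
```

Comparing the stored `updated` value against the feed on each run means only new or changed ebooks are fetched again, which keeps the load on the server low.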

Bora M. Alper

Feb 27, 2021, 9:36:54 AM
to Standard Ebooks
Another option is to create a requester-pays bucket on Amazon S3, so that downloaders pay for the requests and bandwidth while SE covers only the cost of storing the data. At $0.023 per GB, I don't expect this to cost SE more than fifty cents per month, and it would be significantly simpler to maintain than regular BitTorrent or other solutions.

I would be happy to help you set this up if needed.

Regards,
Bora
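
The storage estimate above can be checked with a little arithmetic. The sketch below takes the $0.023 per GB price and the fifty-cent budget from the post, and uses the collection size of about 2.3 GB that is reported later in this thread. It assumes the price applies per GB per month and ignores request and transfer charges, which a requester-pays bucket passes on to downloaders. The function names are illustrative only.

```python
STORAGE_PRICE_PER_GB_MONTH = 0.023  # USD, figure quoted in the post above
MONTHLY_BUDGET = 0.50               # USD, the "fifty cents per month" ceiling


def monthly_storage_cost(size_gb, price=STORAGE_PRICE_PER_GB_MONTH):
    """Storage-only cost per month for a collection of `size_gb` gigabytes."""
    return size_gb * price


def max_size_for_budget(budget=MONTHLY_BUDGET, price=STORAGE_PRICE_PER_GB_MONTH):
    """Largest collection, in GB, that stays within the monthly budget."""
    return budget / price


if __name__ == "__main__":
    # 2.3 GB is the size reported later in the thread.
    print(f"2.3 GB costs about ${monthly_storage_cost(2.3):.4f} per month")
    print(f"$0.50 covers about {max_size_for_budget():.1f} GB")
```

At these figures the fifty-cent estimate leaves plenty of headroom: the collection would have to grow roughly ninefold before storage alone reached that amount.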

Marshall Clow

Feb 27, 2021, 9:46:03 AM
to standar...@googlegroups.com
On Feb 27, 2021, at 6:36 AM, Bora M. Alper <boram...@gmail.com> wrote:

> Another option is to create a requester-pays bucket on Amazon AWS S3 to let the downloaders pay for the requests + bandwidth instead, whilst SE covering for the cost of storing the data only. At the price of $0.023 per GB, I don't estimate this solution to cost more than fifty cents per month to SE, also being significantly simpler to maintain than regular BitTorrents or other solutions.
>
> I would be happy to help you set this up if needed.

I keep a local copy of all the Standard Ebooks titles (epubs only).
I have a Python script that keeps it up to date (from the OPDS feed).
Just checked the size of the library - it is 5.3MB.

— Marshall



Robin Whittleton

Feb 27, 2021, 9:50:19 AM
to standar...@googlegroups.com
That can’t be right. The epub of Pepys’ Diary alone is 3.5MB. Did you mean 530MB?

-Robin

Marshall Clow

Feb 27, 2021, 10:29:50 AM
to standar...@googlegroups.com
On Feb 27, 2021, at 6:50 AM, Robin Whittleton <ro...@reala.net> wrote:

> That can’t be right. The epub of Pepys’ Diary alone is 3.5MB. Did you mean 530MB?

Sorry; something weird happened when I ran `du`: it reported 53xxxxx, which I read as bytes.
Running it a different way gives me about 2.3 GB.

Very strange; sorry for the noise.

— Marshall
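
The thread does not say what went wrong with `du`. One possible source of confusion, offered only as a guess, is that `du` reports block counts rather than bytes by default on many systems. A way to sidestep the question is to sum file sizes directly, as in this minimal sketch; the function names are illustrative, and the result is the apparent size of the files, which can differ from the disk space they occupy.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes, in bytes, of all regular files under `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # skip symlinks to avoid double counting
                total += os.path.getsize(path)
    return total


def human_readable(num_bytes):
    """Format a byte count using decimal units (1 GB = 1,000,000,000 bytes)."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if num_bytes < 1000 or unit == "TB":
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1000


if __name__ == "__main__":
    size = directory_size_bytes(".")
    print(size, "bytes =", human_readable(size))
```

Printing the raw byte count next to the formatted value makes it easy to spot a mistake in units.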
