How the heck did you know that?
> https://archive.org/details/in.ernet.dli.2015.210130
That's a much better link (to send to the other Great Bookers!).
> DOWNLOAD OPTIONS
> * ABBYY GZ download
> * DAISY download For print-disabled users
> * EPUB download
> * FULL TEXT download
Even though I was aiming for a PDF, the "full text" download seems like
the most natural input for a text-to-speech program, don't you think?
> * ITEM TILE download
> * KINDLE download
> * PDF download
> * PDF WITH TEXT download
> * SINGLE PAGE PROCESSED JP2 ZIP
Usually I'm comfortable starting with an EPUB or Kindle for conversion.
But what's the difference between "PDF" and "PDF with text" anyway?
> Their FULL TEXT and PDF WITH TEXT will be OCRed by them, so expect
> typical OCR errors in it.
How do you know that?
Are you saying the EPUB/Kindle are the most faithful then?
> Elijah
> ------
> does not know what all of the formats are
Kindle:
<https://archive.org/download/in.ernet.dli.2015.210130/2015.210130.Private-Lives.mobi>
EPUB:
<https://archive.org/download/in.ernet.dli.2015.210130/2015.210130.Private-Lives.epub>
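For anyone who wants to pass these links along, here is a minimal Python sketch of how the two download URLs above appear to be put together. The pattern (base URL, item identifier, file name) is only inferred from the Kindle and EPUB links in this post, and the helper names `download_url` and `derivative_urls` are made up for illustration. The other formats in the download list may use different file names, so only the `.epub` and `.mobi` endings are assumed here.

```python
from urllib.parse import quote

# Assumption: pattern inferred from the two links in this post,
# not taken from any official description of the site.
BASE = "https://archive.org/download"


def download_url(identifier: str, filename: str) -> str:
    """Build a direct-download URL shaped like the links above."""
    return f"{BASE}/{quote(identifier)}/{quote(filename)}"


def derivative_urls(identifier: str, stem: str, extensions=("epub", "mobi")) -> dict:
    """Map each extension to its download URL for one item.

    Only 'epub' and 'mobi' are confirmed by the links in this post.
    """
    return {ext: download_url(identifier, f"{stem}.{ext}") for ext in extensions}


urls = derivative_urls("in.ernet.dli.2015.210130", "2015.210130.Private-Lives")
for ext, url in urls.items():
    print(ext, url)
```

Run as is, it prints the same EPUB and Kindle links given above.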
I opened that EPUB file in the Windows Calibre program.
It was mostly text, with some scanned page images mixed in.
The disclaimer at the beginning said:
"This book was produced in EPUB format by the Internet
Archive. The book pages were scanned and converted to EPUB
format automatically. This process relies on optical
character recognition, and is somewhat susceptible to
errors. The book may not offer the correct reading
sequence, and there may be weird characters, nonwords, and incorrect
guesses at structure. Some page numbers and headers or footers may remain
from the scanned page. The process which identifies images might have found
stray marks on the page which are not actually images from the book. The
hidden page numbering which may be available to your ereader corresponds to
the numbered pages in the print edition, but is not an exact match; page
numbers will increment at the same rate as the corresponding print edition,
but we may have started numbering before the print book's visible page
numbers. The Internet Archive is working to improve the scanning process
and resulting books, but in the meantime, we hope that this book will be
useful to you."
Using Calibre, I converted that 271KB EPUB into a 625KB PDF file instead.
Unlike before, the font is a normal font now, and it seems to be PDF text.
I think, thanks to you, that the mission was accomplished.
But I'll only know later, when her iPad reads that PDF aloud.
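
For the record, I did the conversion in the Calibre window, but the same EPUB-to-PDF step can probably be scripted with Calibre's `ebook-convert` command-line tool, which takes an input file and an output file and picks the output format from the extension. A rough sketch, assuming Calibre is installed and `ebook-convert` is on the PATH; the file names and the helper names `build_convert_command` and `convert` are just placeholders:

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(src, dst, tool="ebook-convert"):
    """Return the argument list for converting src to dst.

    The output format is chosen by Calibre from dst's extension.
    """
    return [tool, str(src), str(dst)]


def convert(src, dst):
    """Run the conversion; raises if the tool is missing or the run fails."""
    cmd = build_convert_command(src, dst)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("ebook-convert not found on PATH; is Calibre installed?")
    subprocess.run(cmd, check=True)
    return Path(dst)


# Placeholder file names; this only shows the command, it does not run it.
print(build_convert_command("2015.210130.Private-Lives.epub", "Private-Lives.pdf"))
```

Calling `convert(...)` with real paths would do the actual conversion. Whether the resulting PDF reads aloud properly still has to be checked on the iPad.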