Possibility to call the pdf creation


D

Dec 8, 2020, 1:19:30 PM
to tesseract-ocr
Hey guys,

I produce a .hocr file with Google Cloud Vision and gcv2hocr.
I would like to know whether there is an easy way to call Tesseract's PDF creation directly, because its output is better than that of the solutions I found on GitHub.
My goal is to create a PDF from the .hocr file and the image file.
I'd be happy for any kind of help!

Kind Regards,
Dennis

Max Rehberg

Sep 26, 2022, 12:45:58 PM
to tesseract-ocr
I would like to do that as well. Is it possible?

Merlijn B.W. Wajer

Oct 30, 2022, 9:56:12 PM
to tesser...@googlegroups.com

Hi,
Sorry for the delay in my reply, but I created exactly this a few
years ago: https://github.com/internetarchive/archive-pdf-tools

Use recode_pdf to create an (optionally compressed) PDF from a hOCR file
and a set of images. You might first have to combine the hOCR files into a
single file, using https://github.com/internetarchive/archive-hocr-tools

Cheers,
Merlijn
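
For anyone landing here later, a minimal sketch of how the recode_pdf call suggested above could be driven from Python. This is not from Merlijn's message: the flag names (--from-imagestack, --hocr-file, --dpi, -o) are assumptions based on the archive-pdf-tools README and should be checked against `recode_pdf --help` for the installed version. The helper names build_recode_pdf_cmd and run_if_installed, and the file names, are hypothetical.

```python
import shlex
import shutil
import subprocess
from typing import List, Optional


def build_recode_pdf_cmd(image_glob: str, hocr_file: str, out_pdf: str,
                         dpi: Optional[int] = None) -> List[str]:
    """Assemble an argv list for recode_pdf.

    The flag names are assumptions taken from the archive-pdf-tools
    README; verify them with `recode_pdf --help` before relying on them.
    """
    cmd = [
        "recode_pdf",
        "--from-imagestack", image_glob,  # passed as one argument, no shell expansion here
        "--hocr-file", hocr_file,         # single (possibly combined) hOCR file
        "-o", out_pdf,
    ]
    if dpi is not None:
        cmd += ["--dpi", str(dpi)]
    return cmd


def run_if_installed(cmd: List[str]) -> bool:
    """Run the command only if the executable is on PATH."""
    if shutil.which(cmd[0]) is None:
        print("executable not found; would run:", shlex.join(cmd))
        return False
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return True


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the call shape.
    example = build_recode_pdf_cmd("scan.jpg", "scan.hocr", "scan.pdf", dpi=300)
    print(shlex.join(example))
```

Building the argument list separately from running it makes it easy to inspect the exact command first, which is useful while the flags are still unverified.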