Obtain both PDF and HOCR output from single scan?

Chris Falter

Mar 11, 2020, 18:29:49
to tesseract-ocr
Hi, 

My project is using Tesseract 4.x to scan multi-page TIFFs. We need to obtain HOCR output to perform some analytics, and we need to obtain a searchable PDF to interact with a different system.

The documentation shows how to make Tesseract produce either hOCR or a PDF. Is it possible to produce both with a single scan, or do we have to do two scans, one for each output?

Thanks in advance for your help! And I apologize if my search through this forum's messages failed to find an answer that already exists.

Best,
Chris Falter


Shree Devi Kumar

Mar 11, 2020, 21:39:14
to tesseract-ocr
Use both at the end of the command line, e.g.:

tesseract image outbase -l foo --oem 1 hocr pdf
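For anyone driving this from a script, here is a minimal sketch of the same single-run invocation wrapped in Python. The helper names (`build_command`, `expected_outputs`, `run_ocr`), the file names `scan.tif` and `scan_out`, and the `eng` language code are assumptions for illustration, not part of the original answer. It relies on Tesseract appending the extensions itself, so one run should leave both `outbase.hocr` and `outbase.pdf` on disk.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(image, outbase, lang="eng"):
    """Return the argv list for one Tesseract run that emits hOCR and PDF."""
    # "hocr" and "pdf" are output config names; listing both after the other
    # options requests both outputs from the same run.
    return ["tesseract", str(image), str(outbase),
            "-l", lang, "--oem", "1", "hocr", "pdf"]


def expected_outputs(outbase):
    """Return the two files Tesseract is expected to write for this base name."""
    # Tesseract appends the extension to the output base: .hocr and .pdf.
    return [Path(f"{outbase}.hocr"), Path(f"{outbase}.pdf")]


def run_ocr(image, outbase, lang="eng"):
    """Run Tesseract once and return the expected output paths that exist."""
    subprocess.run(build_command(image, outbase, lang), check=True)
    return [p for p in expected_outputs(outbase) if p.exists()]


if __name__ == "__main__":
    # Hypothetical file names; only run when the binary and the image exist,
    # otherwise just print the command that would be executed.
    if shutil.which("tesseract") and Path("scan.tif").exists():
        print(run_ocr("scan.tif", "scan_out"))
    else:
        print(" ".join(build_command("scan.tif", "scan_out")))
```

Keeping the command as a list avoids shell quoting problems with file names that contain spaces.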

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/cc893bda-fc7c-42ab-9bcc-22ab25b5c846%40googlegroups.com.

Chris Falter

Mar 12, 2020, 11:36:41
to tesseract-ocr
Thanks!

