tesseract PDF line offset in 4.0.0 alpha.


Janpieter Sollie

8 Jul 2017, 02:24:55
to tesseract-ocr
Hello everyone,

I discovered Tesseract a few days ago and am experimenting with it to make searchable PDFs.
But I have run into a few problems, and maybe one of you can help:
- When generating PDFs, all spaces are converted to newlines. This does not happen when generating a .txt file. Why? Is there a Tesseract parameter that controls this behaviour?
- When generating PDFs, the selectable text sits on the line below where the visible text is placed. Is there an offset parameter I need to set to 0?

thanks!

DJArty

19 Feb 2018, 07:54:09
to tesseract-ocr
Which PDF viewer / renderer exactly are you using? Did you try another one?
