Hi Guys,
My task is to convert a PDF made of page images into a text file.
First, I convert the PDF to PNG with poppler-utils, which gives me better resolution than ImageMagick.
Second, I pass the PNG to tesseract to recognize the words, but the output contains many errors.
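For reference, the two steps look roughly like the sketch below. It is only an approximation of my setup: the 300 DPI value, the "eng" language, the "page" prefix, and the helper names (pdftoppm_cmd, tesseract_cmd, ocr_pdf) are assumptions made for illustration. It relies on pdftoppm (from poppler-utils) and tesseract being on the PATH.

```python
import glob
import subprocess


def pdftoppm_cmd(pdf_path, out_prefix, dpi=300):
    """Build the pdftoppm command that renders each PDF page to PNG.

    -r sets the rendering resolution in DPI; 300 is an assumed value.
    """
    return ["pdftoppm", "-r", str(dpi), "-png", pdf_path, out_prefix]


def tesseract_cmd(image_path, out_base, lang="eng"):
    """Build the tesseract command; the text is written to out_base + '.txt'."""
    return ["tesseract", image_path, out_base, "-l", lang]


def ocr_pdf(pdf_path, out_prefix="page", dpi=300, lang="eng"):
    """Render every page of the PDF to PNG, then OCR each page image."""
    subprocess.run(pdftoppm_cmd(pdf_path, out_prefix, dpi), check=True)
    # pdftoppm names its output <prefix>-<page number>.png
    pages = sorted(glob.glob(out_prefix + "-*.png"))
    for png in pages:
        out_base = png[: -len(".png")]
        subprocess.run(tesseract_cmd(png, out_base, lang), check=True)
    return pages
```

Calling ocr_pdf("scan.pdf") (a hypothetical file name) should leave one .txt file next to each rendered page image.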
I don't know whether the image format (TIFF, PNG, or JPEG) affects OCR accuracy. Also, can tesseract automatically rescale a picture to the best resolution, for example by zooming in up to 100%?
Regards,
Levin