tesseract-ocr is not converting or extracting the text properly

Arul Britto Kumar Abraham

Nov 14, 2023, 11:18:09 PM

to tesseract-ocr

Hi,

I am using tesseract-ocr in my Python code to convert a non-searchable PDF into a searchable PDF document, but the text is not being converted fully.

I am using "poppler-23.08.0" to convert the PDF pages to images.

From each image, I use the "pytesseract.image_to_pdf_or_hocr" method to produce a PDF file, and later I combine all the pages into a single file using PdfFileMerger.
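
For reference, the workflow described above might look roughly like the sketch below. This is a reconstruction and not the exact code in use. It assumes the pdf2image package as the poppler wrapper and PyPDF2 for merging. The function names, the 300 DPI default, and the page segmentation mode are hypothetical choices for illustration only.

```python
import io


def build_tesseract_config(dpi=300, psm=3):
    """Build extra Tesseract command-line flags (values are illustrative)."""
    return f"--dpi {dpi} --psm {psm}"


def make_searchable_pdf(input_pdf, output_pdf, dpi=300, lang="eng",
                        poppler_path=None):
    """Rasterize each page, OCR it into a one-page PDF, merge the results."""
    # Third-party imports live inside the function so the helper above
    # can be used without these packages installed.
    from pdf2image import convert_from_path  # assumed poppler wrapper
    import pytesseract
    try:
        from PyPDF2 import PdfMerger  # name in newer PyPDF2 releases
    except ImportError:
        from PyPDF2 import PdfFileMerger as PdfMerger  # older releases

    # Render every PDF page to a PIL image at the requested resolution.
    pages = convert_from_path(input_pdf, dpi=dpi, poppler_path=poppler_path)

    config = build_tesseract_config(dpi=dpi)
    merger = PdfMerger()
    for page in pages:
        # image_to_pdf_or_hocr returns the searchable page as PDF bytes.
        pdf_bytes = pytesseract.image_to_pdf_or_hocr(
            page, extension="pdf", lang=lang, config=config
        )
        merger.append(io.BytesIO(pdf_bytes))

    # Write the combined searchable document to disk.
    with open(output_pdf, "wb") as fh:
        merger.write(fh)
    merger.close()
    return len(pages)
```

A call such as `make_searchable_pdf("scan.pdf", "searchable.pdf")` (file names are placeholders) would write the merged document and return the number of pages processed. The rendering resolution and language are exposed as parameters because they are settings commonly adjusted when OCR output is incomplete.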