tesseract-ocr is not converting or extracting the text properly


Arul Britto Kumar Abraham

Nov 14, 2023, 11:18:09 PM
to tesseract-ocr
Hi,

I am using tesseract-ocr in my Python code to convert a non-searchable PDF into a searchable PDF document, but the text is not being converted fully.



I am using "poppler-23.08.0" to convert each PDF page to an image.
From each image I use the "pytesseract.image_to_pdf_or_hocr" method to produce a PDF file, and afterwards I combine all the pages into a single file using PdfFileMerger.
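
For reference, a minimal sketch of the pipeline described above. This is an illustration under assumptions, not the exact code in use: the function names `ocr_pages`, `merge_pdfs`, and `make_searchable` are hypothetical, pages are assumed to be rendered through the `pdf2image` wrapper around poppler, the merge step uses `pypdf`'s `PdfWriter` in place of PdfFileMerger, and the 300 DPI default and `lang="eng"` are guesses rather than settings known from the original setup.

```python
import io


def ocr_pages(images, ocr_page=None, lang="eng"):
    """Return one single-page searchable PDF (as bytes) per input image.

    `ocr_page` is a hypothetical hook so the OCR step can be swapped out;
    by default it calls pytesseract.image_to_pdf_or_hocr.
    """
    if ocr_page is None:
        import pytesseract  # imported lazily; requires the Tesseract binary

        def ocr_page(image):
            return pytesseract.image_to_pdf_or_hocr(
                image, extension="pdf", lang=lang
            )

    return [ocr_page(image) for image in images]


def merge_pdfs(pdf_blobs, out_path):
    """Concatenate in-memory PDFs into one file, preserving page order."""
    from pypdf import PdfReader, PdfWriter  # assumption: pypdf is installed

    writer = PdfWriter()
    for blob in pdf_blobs:
        for page in PdfReader(io.BytesIO(blob)).pages:
            writer.add_page(page)
    with open(out_path, "wb") as fh:
        writer.write(fh)


def make_searchable(pdf_path, out_path, dpi=300, lang="eng"):
    """Rasterize each page, OCR it, and merge the results into one PDF."""
    from pdf2image import convert_from_path  # wrapper around poppler

    # Assumption: a higher DPI than the pdf2image default may help
    # Tesseract recognize small text; 300 is only a starting point.
    images = convert_from_path(pdf_path, dpi=dpi)
    merge_pdfs(ocr_pages(images, lang=lang), out_path)
```

Calling `make_searchable("input.pdf", "searchable.pdf")` (hypothetical file names) would run all three steps. Keeping the OCR step separate makes it possible to inspect the output for an individual page when some text is missing.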

Can anyone share their thoughts on this?
