Olivia
Apr 13, 2023, 1:28:26 AM
to tesseract-ocr
Hello,
I have some old, busy documents that I'm trying to OCR. Tesseract does an incredible job with them out of the box (especially compared to other open source tools), but there are a few lines that it fails to detect entirely. I've spent some time trying to figure out how Tesseract detects text lines, to no real avail. So I have two questions for the community: how does Tesseract detect text lines, and, if detection is ML-based, is it possible to fine-tune that model on our own datasets?
Thank you in advance for your answer!
Olivia