newbie on training tesseract

Istvan Kassai

Apr 28, 2021, 5:28:45 AM
to tesseract-ocr
Hi everyone!

I have a lot of scanned documents in a font type and font size combination that Tesseract recognizes very poorly. The documents are authored by a governmental office, so the obvious solution (recreating them in another font) isn't an option.
I decided to train Tesseract, but every article I found lacks steps, explanations, or something else essential to success.
Is there a comprehensive tutorial or step-by-step guide for training that you can recommend?

Environment: Tesseract 4.1 on Ubuntu Focal, everything installed from the distribution repository.

Thanks in advance,
Istvan