Hi everyone!
I have a lot of scanned documents set in a font type and size combination that Tesseract recognizes very poorly. The documents are authored by a government office, so the obvious solution (recreating them with a different font) is not an option.
I decided to train Tesseract, but every article I found is missing steps, explanations, or something else essential to success.
Is there a comprehensive tutorial or step-by-step training guide you can recommend?
Environment: Tesseract 4.1 on Ubuntu Focal, everything installed from the distribution repository.
Thanks in advance,
Istvan