Tesseract misses some single characters and misrecognizes others

Sa

Mar 25, 2019, 5:10:32 AM

to tesseract-ocr

Hi all,

I am having problems using Tesseract to recognize the text in the attached image.

I used the Tesseract v4.0.0 executable (released Oct 30, 2018) with the best (tessdata_best) models deu.traineddata and eng.traineddata. deu.traineddata was fine-tuned on the font used in the image. I call Tesseract through the pytesseract wrapper:

result = pytesseract.image_to_data(img, lang = 'deu+eng', config = '--psm 11')

I also tried lang = 'deu' with --psm 10, but neither configuration gave satisfactory results. 'y', 'a' and 'e' are always missed, and 'ü', 'ö' and 'ä' are recognized as 'U', 'O' and 'a'.
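For anyone who wants to reproduce this, here is a minimal sketch of how the two configurations can be compared side by side. The helper names `parse_tsv` and `compare` are made up for illustration. It assumes the attached image is saved locally as Tastatur_cropped.png, that Pillow is installed, and that the fine-tuned deu.traineddata is in the tessdata directory Tesseract uses.

```python
import csv
import io


def parse_tsv(tsv):
    """Return (text, conf) pairs for the non-empty entries in image_to_data TSV output."""
    # image_to_data returns a tab-separated string by default; the header row
    # includes the columns "conf" and "text".
    reader = csv.DictReader(io.StringIO(tsv), delimiter="\t", quoting=csv.QUOTE_NONE)
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        if text:
            words.append((text, float(row["conf"])))
    return words


def compare(image_path, settings):
    """Run Tesseract once per (lang, psm) pair and collect the recognized text."""
    # Imported here so parse_tsv can be used without the OCR stack installed.
    import pytesseract
    from PIL import Image

    img = Image.open(image_path)
    results = {}
    for lang, psm in settings:
        tsv = pytesseract.image_to_data(img, lang=lang, config=f"--psm {psm}")
        results[(lang, psm)] = parse_tsv(tsv)
    return results


# Example call (assumed local file name, the two settings described above):
# for key, words in compare("Tastatur_cropped.png", [("deu+eng", 11), ("deu", 10)]).items():
#     print(key, words)
```

Printing the recognized text next to its confidence makes it easier to see whether a character such as 'y' is dropped entirely or returned with a low score.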

Any help would be appreciated. Can anyone get good results on this image?

Thank you and best regards,

Tastatur_cropped.png
