Tesseract misses some single characters and misrecognizes others


Sa

Mar 25, 2019, 5:10:32 AM
to tesseract-ocr
Hi all,


I am having trouble using Tesseract to recognize the text in the attached image.

I used the Tesseract executable v4.0.0 (released on 30 Oct. 2018) with the "best" tessdata models deu.traineddata and eng.traineddata. deu.traineddata was fine-tuned with the font used in the image. I also used pytesseract as a wrapper.

result = pytesseract.image_to_data(img, lang='deu+eng', config='--psm 11')


I also tried lang = deu with psm = 10, but neither configuration gave satisfactory results. 'y', 'a' and 'e' are always missed, and 'ü', 'ö' and 'ä' are recognized as 'U', 'O' and 'a'.
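
For anyone who wants to reproduce this, here is a rough sketch of how the two configurations could be compared side by side. It assumes that image_to_data returns its default tab-separated string, with a header row that includes "conf" and "text" columns. The helper names parse_tsv and compare_configs are made up for illustration and are not part of pytesseract.

```python
import csv
import io


def parse_tsv(tsv_text):
    """Return (text, confidence) pairs for the rows that carry recognized text."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        if not text:
            continue  # page/block/line rows have an empty text field
        words.append((text, float(row["conf"])))
    return words


def compare_configs(img, configs):
    """Run OCR once per (lang, psm) pair and collect the recognized words.

    Hypothetical helper: `img` is whatever image object or path was
    passed to pytesseract in the original call.
    """
    import pytesseract  # imported lazily so parse_tsv works without it

    results = {}
    for lang, psm in configs:
        tsv = pytesseract.image_to_data(img, lang=lang, config=f"--psm {psm}")
        results[(lang, psm)] = parse_tsv(tsv)
    return results
```

Calling compare_configs(img, [("deu+eng", 11), ("deu", 10)]) would then give one list of (text, confidence) pairs per configuration, which makes it easier to see which characters are dropped entirely and which come back as a different letter.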

Please help me out. Can you get good results?


Thank you and best regards,

Sa
Tastatur_cropped.png