Pytesseract can't read my image (close letter problem)


Dtractus

May 31, 2020, 3:19:53 PM
to tesseract-ocr
I think pytesseract can't read my image because the letters are too close together.


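    from PIL import Image
    import pytesseract

    # Open the image and run OCR on it
    img = Image.open('test2.png')
    text = pytesseract.image_to_string(img)
    print(text)
    # Save the recognized text to a file
    with open('file.txt', mode='w') as f:
        f.write(text)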




test2.png

Lorenzo Bolzani

May 31, 2020, 3:56:50 PM
to tesser...@googlegroups.com
Hi, first invert the image.

$ tesseract -l eng test2.png -
FUTLutz

FUTSalkay
FUTLovazin
FUTRaum


Also upscale the image to twice its size so that the text height is about 30 to 50 pixels; this fixes the wrong letter.
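
For anyone doing this from Python rather than the command line, here is a minimal sketch of the two steps (invert, then upscale 2x) using Pillow. It assumes the image has light text on a dark background, which is why inverting helps. The helper names `preprocess` and `ocr_file` are made up for illustration and are not part of pytesseract.

```python
from PIL import Image, ImageOps


def preprocess(img, scale=2):
    """Invert the image and enlarge it by `scale` (hypothetical helper)."""
    gray = img.convert("L")           # grayscale; ImageOps.invert rejects RGBA input
    inverted = ImageOps.invert(gray)  # light-on-dark text becomes dark-on-light
    new_size = (inverted.width * scale, inverted.height * scale)
    return inverted.resize(new_size, Image.LANCZOS)


def ocr_file(path, scale=2):
    """Run Tesseract on the preprocessed image (hypothetical helper)."""
    # Imported here so preprocess() can be used without pytesseract installed.
    import pytesseract

    with Image.open(path) as img:
        return pytesseract.image_to_string(preprocess(img, scale), lang="eng")
```

Calling `print(ocr_file('test2.png'))` would then run the OCR on the preprocessed image. The factor of 2 is only a starting point; the aim is a text height of roughly 30 to 50 pixels, so a different image may need a different `scale`.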


Lorenzo

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/2247dec4-8c66-4ecd-afc9-c4fea8bd86fb%40googlegroups.com.

Dtractus

May 31, 2020, 5:30:59 PM
to tesseract-ocr
Hi, I changed the code (added -l eng), but it still can't read the image.

My results:

ait
NaC]
ere
at

Code:

    img = Image.open('test2.png')
    text = pytesseract.image_to_string(img, config='-l eng')
    print(text)

I don't understand what the problem is.

