How to improve recognition for the given image?


TrueWodzu

Mar 18, 2020, 7:04:10 AM
to tesseract-ocr
Hi guys,

I am testing tesseract (4.0.0) and I am a new user of the library. I've read the documentation about improving quality, but it did not help in this particular case.

155_2.jpg

The TIFF image produced by tesseract is in the attachment.

The command I am running is tesseract.exe "155_2.jpg" output.txt --psm 7 -c tessedit_write_images=1

I've tried various psm values: without psm I get nothing, with psm 7 my reading is "195", and with psm 8 there is no result.
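For anyone repeating this comparison, here is a minimal Python sketch of a page segmentation mode sweep. It only uses the command-line flags that appear in this thread (`--psm`, `-l`, `-c key=value`). The helper names `build_tesseract_cmd` and `run_psm_sweep` and the list of modes tried are assumptions made for illustration, and the sweep assumes a `tesseract` executable is on the PATH.

```python
import subprocess


def build_tesseract_cmd(image, output_base, psm=None, lang=None, config=None):
    """Build a tesseract argument list (hypothetical helper, not part of Tesseract)."""
    cmd = ["tesseract", image, output_base]
    if psm is not None:
        cmd += ["--psm", str(psm)]
    if lang:
        cmd += ["-l", lang]
    # Each config variable is passed as its own "-c key=value" pair.
    for key, value in (config or {}).items():
        cmd += ["-c", f"{key}={value}"]
    return cmd


def run_psm_sweep(image, psm_values=(6, 7, 8, 13)):
    """Run tesseract once per mode and collect the recognized text.

    Uses "stdout" as the output base so the text is printed instead of
    written to a file. The default modes are an arbitrary choice.
    """
    results = {}
    for psm in psm_values:
        cmd = build_tesseract_cmd(image, "stdout", psm=psm)
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results[psm] = proc.stdout.strip()
    return results


# Equivalent of the command quoted in the post, with "output" as the base name.
cmd = build_tesseract_cmd(
    "155_2.jpg", "output", psm=7, config={"tessedit_write_images": 1}
)
print(" ".join(cmd))
```

Tesseract appends `.txt` to the output base, so passing `output.txt` as in the original command produces a file named `output.txt.txt`. Calling `run_psm_sweep("155_2.jpg")` returns a dictionary keyed by mode, which makes it easy to see which modes return text at all.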

I appreciate any advice you can give me.

Thanks!
tessinput.tif
155_2.jpg

Ha Duong

Mar 19, 2020, 5:01:53 AM
to tesseract-ocr
Dear TrueWodzu,
Use the "best" trained data file for digits.
Before you pass the image to Tesseract for OCR, you can preprocess it, because your image is low quality.
You can use Cv2.Erode() with a kernel size of (5,5) and SetResolution(xDpi = 300, yDpi = 300) (300 DPI works best for OCR), then run Tesseract on the result.

- This is the image after processing:

Capture.PNG

- Command: tesseract Capture.PNG output --psm 6 -l digits

- Result: 

result.PNG
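To make the erosion step suggested above concrete, here is a pure-Python sketch of what a grayscale erosion with a 5x5 rectangular kernel computes: each output pixel becomes the minimum of its neighborhood. This is an illustrative stand-in, not the OpenCV implementation. The function name `erode` and the toy image are assumptions, and the thickening effect described in the comments assumes dark digits on a lighter background.

```python
def erode(gray, ksize=5):
    """Grayscale erosion with a ksize x ksize rectangular kernel.

    `gray` is a list of rows of intensities (0 = black, 255 = white).
    Pixels outside the image are ignored, so windows shrink at the borders.
    """
    h = len(gray)
    w = len(gray[0]) if h else 0
    r = ksize // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                gray[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            ]
            # Taking the minimum spreads dark pixels, thickening dark strokes.
            row.append(min(window))
        out.append(row)
    return out


# Toy example: a 7x7 white image with one dark pixel in the centre.
img = [[255] * 7 for _ in range(7)]
img[3][3] = 0
eroded = erode(img, 5)

for row in eroded:
    print("".join("#" if v == 0 else "." for v in row))
```

The single dark pixel grows into a 5x5 dark block, which is why erosion can help with thin or broken strokes and can also merge characters that sit close together if the kernel is too large for the image.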



TrueWodzu

Mar 19, 2020, 5:28:03 AM
to tesseract-ocr
Hi,

Thank you for your answer.

I think it works kind of randomly. When I rescaled the image and decreased its resolution, it was able to give me a correct reading.