Getting time from image

Ashwan Reddy

Oct 11, 2018, 9:24:46 AM
to tesseract-ocr
Hi,

I'm trying to extract "8:56" from this image, which is cropped from a portion of a basketball broadcast. This command returns "757" using Tesseract 3.05, which is not the result I'm hoping for:

tesseract myimage.jpg -c tessedit_char_whitelist=0123456789:. -c tessedit_write_images=1 -psm 7 stdout

I've attached the tessinput image, which shows that the pre-processing steps basically remove the time entirely. Cropping the image to fit just the text area is not an option for my purposes unfortunately. Any ideas on how I could improve the result otherwise?

Thanks!
Ashwan
myimage.jpg
tessinput.tif
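
Independent of any preprocessing, a cheap guard on the output side is to reject results that cannot be a game clock, so that a misread such as "757" is flagged rather than used. The sketch below is only an illustration: the function name `parse_clock` and the two accepted formats (`M:SS`, plus `SS.T` for the final minute, which would explain the `.` in the whitelist) are assumptions, not something stated in the thread.

```python
import re

# Hypothetical helper: the accepted clock formats are assumptions.
_MIN_SEC = re.compile(r"^(\d{1,2}):([0-5]\d)$")   # e.g. "8:56"
_SEC_TENTH = re.compile(r"^[0-5]?\d\.\d$")        # e.g. "45.3"


def parse_clock(text):
    """Return the clock value in seconds, or None if text is not a clock."""
    text = text.strip()
    match = _MIN_SEC.match(text)
    if match:
        minutes, seconds = int(match.group(1)), int(match.group(2))
        return float(minutes * 60 + seconds)
    if _SEC_TENTH.match(text):
        return float(text)
    return None


print(parse_clock("8:56"))  # the value hoped for in the post above
print(parse_clock("757"))   # the misread: not a valid clock, so None
```

A `None` result can then trigger a retry with different preprocessing instead of silently passing on a wrong time.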

Adrian Owen

Oct 11, 2018, 9:42:49 AM
to tesser...@googlegroups.com

Hi Ashwan,

GIMP is your friend: https://stackoverflow.com/questions/9480013/image-processing-to-improve-tesseract-ocr-accuracy

If you're programming, use the KalikoImage library to replicate the manual GIMP steps; that's easy.

I found greyscale didn’t help.

YES: Long line removal (may not apply to you) (OpenCV)

YES: Resize to 300 DPI

YES: Apply filters
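
The resize step can be sketched as a small calculation of the target pixel size. This is a minimal illustration, not Adrian's actual procedure: the helper name `upscale_size` and the 72 DPI source resolution in the example are assumptions, and the resulting size would still have to be passed to the resize function of whatever image library is in use.

```python
def upscale_size(width, height, current_dpi, target_dpi=300):
    """Return the (width, height) in pixels that brings an image
    from current_dpi up to target_dpi, keeping the aspect ratio."""
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    scale = target_dpi / current_dpi
    return round(width * scale), round(height * scale)


# Assumed example: a 120x40 crop treated as 72 DPI.
print(upscale_size(120, 40, current_dpi=72))
```

A crop from a video frame carries no meaningful DPI metadata, so in practice the scale factor is something to experiment with rather than a fixed number.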

Hope this helps, Adrian

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/1a88bcb0-202c-420b-be6b-6e0e7a84258f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ashwan Reddy

Oct 11, 2018, 9:47:57 AM
to tesseract-ocr
Thanks for the suggestions, Adrian. I'll give those a try.

Ashwan

Soumik Ranjan Dasgupta

Oct 11, 2018, 12:05:44 PM
to tesser...@googlegroups.com
As an alternative, you could try version 4, provided the image size is appropriate. I have used version 4 on raw images like these, and the results were unexpectedly good.
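
When switching versions, note that the page segmentation flag is spelled differently: 3.x takes `-psm`, while 4.x expects `--psm`. The sketch below builds the command from the original post for either version; the helper name `build_command` is hypothetical. One caveat, to my knowledge: in 4.0 the `tessedit_char_whitelist` setting is only honored by the legacy engine, with support in the LSTM engine arriving in 4.1.

```python
def build_command(image, major_version, psm=7, whitelist="0123456789:."):
    """Build the tesseract argument list used in this thread.

    Tesseract 3.x spells the page segmentation flag -psm; 4.x uses --psm.
    """
    psm_flag = "-psm" if major_version < 4 else "--psm"
    return [
        "tesseract", image,
        "-c", f"tessedit_char_whitelist={whitelist}",
        psm_flag, str(psm),
        "stdout",
    ]


print(" ".join(build_command("myimage.jpg", major_version=4)))
```

The returned list can be passed directly to `subprocess.run(..., capture_output=True, text=True)` to collect the recognized text.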


--
Regards,
Soumik Ranjan Dasgupta