Help with lottery tickets

Danilo Tuler

Nov 12, 2014, 9:51:07 PM
to tesser...@googlegroups.com
Hi,

I'm trying to OCR the attached lottery tickets (not winning tickets, unfortunately :-).
The scans are grayscale 300 dpi TIFFs.

I tried the standard English language data, with little success.
Then I tried creating a new language with one or two fonts and training it.
The results were even worse.

What do you think?

Thanks,
Danilo

lot.zip

PorridgeBear

Nov 13, 2014, 10:19:13 AM
to tesser...@googlegroups.com
You will need to do some kind of pre-processing on the image before sending it to Tesseract.

For instance, if you know the ticket is always a certain size and the image is always straight, you can first crop out the rectangular area for each row (I'm assuming you want the numbers in each row, but the same applies to other areas).
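
To make the cropping idea concrete, here is a minimal sketch in plain Python. It treats the scan as a 2D list of grayscale values and cuts out equally spaced horizontal bands. The function name `crop_rows` and all the coordinates are hypothetical; the real offsets would have to be measured on an actual ticket, and in practice an imaging library would do the cropping.

```python
def crop_rows(image, top, row_height, n_rows, left, right):
    """Cut n_rows horizontal bands of equal height out of a 2D pixel grid.

    image is a list of scan lines, each a list of grayscale values.
    top, row_height, left and right are layout values assumed to be
    known in advance for a fixed-size, unrotated ticket.
    """
    bands = []
    for i in range(n_rows):
        y0 = top + i * row_height          # first scan line of this band
        y1 = y0 + row_height               # one past the last scan line
        band = [line[left:right] for line in image[y0:y1]]
        bands.append(band)
    return bands


# Tiny synthetic "scan": 6 scan lines of 8 pixels, each pixel = line index.
image = [[y] * 8 for y in range(6)]

# Two bands of height 2, starting at line 1, columns 2..5 (made-up layout).
bands = crop_rows(image, top=1, row_height=2, n_rows=2, left=2, right=6)
```

Each element of `bands` would then be passed to Tesseract on its own, so it only ever sees one row of numbers at a time.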

Having done that, I ran Tesseract on the first row:

1.8 27 29 3(2) 37 50

Nearly, but not quite.

I then thresholded the cropped image, which made the black strokes bolder, and ran Tesseract again:

18 27 29 30 37 50

Perfect.
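
A rough sketch of the threshold-and-thicken step, again in plain Python on a 2D list of grayscale values. The cutoff of 128 and the 3x3 neighbourhood are assumptions for illustration, not the exact settings used above, and the function names are made up.

```python
def threshold(image, cutoff=128):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[0 if px < cutoff else 255 for px in line] for line in image]


def thicken_dark(binary):
    """Grow black regions by one pixel in every direction.

    A pixel becomes black if it or any of its 8 neighbours is black,
    which makes thin dark strokes bolder.
    """
    height, width = len(binary), len(binary[0])
    out = [line[:] for line in binary]     # copy so reads use the original
    for y in range(height):
        for x in range(width):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny][nx] == 0:
                        out[y][x] = 0
    return out


# Synthetic 5x5 patch: light background with one dark pixel in the middle.
scan = [[200] * 5 for _ in range(5)]
scan[2][2] = 50

bold = thicken_dark(threshold(scan))
```

Here the single dark pixel grows into a 3x3 black block. On a real scan the cutoff would need tuning, and too much thickening can make neighbouring digits touch.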

Cheers