OCR on Nintendo game screenshots


Leah Siddall

Apr 23, 2015, 2:38:29 AM
to tesser...@googlegroups.com
Hi all! 

I am not having luck with Tesseract and the fonts used in NES games like Super Mario Bros. 3 (I've attached an example screenshot).
My goal is to scrape a screenshot for the "score" and "time remaining". The idea is to feed that into a database during a competition to minimize cheating.

I've tried cropping, resizing, converting to grayscale, and negating, with PNG, TIF, JPG, and PNM formats, then going through every PSM mode on each, all with poor results.
The original screenshot is a PNG, 4800 × 3600 pixels at 144 pixels/inch, straight from the emulator, which should be about the best possible input.

Just to get a baseline, I tried it against the "Punch Out" screenshot (attached), where the characters are clearly spaced and there is lots of empty space. It would get "CDHTIHUE" and "Nintendo", but it totally missed the word "new" between the boxing gloves and jumbled the year numbers.

To rule out user error, I did run against other images with more standard fonts and had no problems. 

I'm quite comfortable with ImageMagick but very new to Tesseract.
I am using the Tesseract version from "brew install tesseract --HEAD" on OS X 10.10.2:
tesseract 3.04.00
 leptonica-1.71
  libjpeg 8d : libpng 1.6.16 : libtiff 4.0.3 : zlib 1.2.5

This would be really, really cool to pull off if possible. Any suggestions are greatly appreciated.
thanks!! -leah
Super Mario Bros. 3 2015-04-22 21.05.41.png
Punch-Out!! 2015-04-22 21.26.19.png

Dmitri Silaev

Apr 23, 2015, 5:51:55 AM
to tesser...@googlegroups.com
Hmmm, fixed image size, fixed region, constant colors, monospace raster font...

Do you really want to engage a whole algorithmic monster to handle a problem like this? Poor performance, training, preprocessing, and coping with all sorts of recognition problems are guaranteed.

Pixel-to-pixel matching is the way to go!
100% accuracy.

Even if you're not willing to resort to full-fledged programming, just crop out 10 digit samples and match them to your input image using a shell script loop. Give your ImageMagick-fu a chance. Or, you can even use file compare! ))

HTH

Best regards,
Dmitri Silaev
www.CustomOCR.com

Leah Siddall

Apr 23, 2015, 3:28:14 PM
to tesser...@googlegroups.com
Thanks for your feedback!

I was hoping not to lock into one video game. The high score may not be in the same place in every game, which would rule out cropping to a fixed region. I planned on running a regex against whatever came back from Tesseract. I was already counting on garbage in the output, so there was going to be some light scripting wrapped around this.

But when I cropped to only the "lower third" section of the Mario screenshot, I still wasn't getting anything close to the score or time. Why is it struggling with this font? It seems incredibly straightforward, except that the "scores" are not a solid color with a border, and sometimes they are touching.

Since this is a new arena to me, can you point me in the right direction for researching how to do the "pixel-to-pixel" matching?
Also, I am new to the idea of training Tesseract. Can I train it to understand this font?

This is more exploratory and fun for me, so I am very willing to learn the "correct way" of doing this. I just want to be pointed in the right direction.

thanks again!!
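
For readers following along, the regex post-processing idea described in this message could look roughly like the sketch below. It is only an illustration: the function name is hypothetical, and the rules (score is the longest digit run of at least six digits, timer is the first run of exactly three digits) are assumptions, not something stated in the thread.

```python
import re


def extract_score_and_time(ocr_text):
    """Guess a score and a timer from noisy OCR text.

    Assumptions (illustrative only): the score is the longest run of
    at least six digits, and the timer is the first run of exactly
    three digits. Returns (score, time_left); either may be None.
    """
    runs = re.findall(r"\d+", ocr_text)
    score = max((r for r in runs if len(r) >= 6), key=len, default=None)
    time_left = next((r for r in runs if len(r) == 3), None)
    return (
        int(score) if score is not None else None,
        int(time_left) if time_left is not None else None,
    )
```

When the OCR output contains no plausible digit runs, both values come back as `None`, which lets the calling script reject the screenshot instead of storing garbage.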

Leah Siddall

Apr 23, 2015, 3:51:07 PM
to tesser...@googlegroups.com
Also, I wanted to show the output from the lower third: 

E53333 I-I-I-I-I-|--E.|- $5 a].
ICED}: E]- EIEIEIEEHEIEJ GEE-'3

As you can see, I'm not even getting numbers. :/

Dmitri Silaev

Apr 23, 2015, 4:53:29 PM
to tesser...@googlegroups.com
Don't waste your time with Tesseract here, I tell ya. You'd only get all sorts of unnecessary hassle. And most importantly, you'll be frustrated by the accuracy.

By "pixel-to-pixel" I mean what is described e.g. here, section "Naive Template Matching":
http://docs.adaptive-vision.com/current/studio/machine_vision_guide/TemplateMatching.html
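
As a rough illustration of the general sliding-window idea, here is a minimal sketch. It assumes images are plain 2D lists of pixel values and uses a simple count of mismatched pixels as the dissimilarity measure; the linked guide may well use a different measure, and `find_template` is a hypothetical name.

```python
def find_template(image, template):
    """Slide template over image and return the best placement.

    Both arguments are 2D lists (rows of pixel values). Returns a tuple
    (row, col, mismatches) for the placement with the fewest differing
    pixels, or None if the template does not fit inside the image.
    """
    image_h, image_w = len(image), len(image[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    best = None
    for row in range(image_h - tmpl_h + 1):
        for col in range(image_w - tmpl_w + 1):
            # Count pixels that differ at this placement.
            mismatches = sum(
                image[row + y][col + x] != template[y][x]
                for y in range(tmpl_h)
                for x in range(tmpl_w)
            )
            if best is None or mismatches < best[2]:
                best = (row, col, mismatches)
    return best
```

A result with zero mismatches is an exact hit. The cost grows with image size times template size, which is one reason to prefer a fixed-location check when the layout is known.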

But in your case that wouldn't be dumb iteration over the entire image, just a single check at a fixed location of whether the template image has exactly the same pixels as the input image. You can arrange it like this:
- Crop out samples of all digits (each sized 85x60) -> digit0.png .. digit9.png
- Crop out the same sized rectangle from a fixed location of your source image - e.g. score digit #0 -> score0.png
- Do file compare score0.png to digit0.png
- If no match - try digit1.png
...
- Match found - this is your score digit #0
- Take next score digit
...
- Proceed to time digits
...
- Done

Simple!
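
The steps above can be sketched in a few lines. This is a minimal sketch under stated assumptions: images are already loaded as 2D lists of pixel values (loading PNG files would need an imaging library and is left out), and the names `crop`, `read_digits`, `templates`, and `positions` are hypothetical.

```python
def crop(image, top, left, height, width):
    """Return the height x width sub-grid whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def read_digits(image, templates, positions, height, width):
    """Read one digit per fixed position by exact pixel comparison.

    templates maps a digit character to its pixel grid; positions is a
    list of (top, left) corners, one per digit cell. A cell that matches
    no template is reported as "?".
    """
    result = []
    for top, left in positions:
        cell = crop(image, top, left, height, width)
        match = next(
            (digit for digit, grid in templates.items() if grid == cell),
            "?",
        )
        result.append(match)
    return "".join(result)
```

Returning "?" for an unmatched cell keeps the failure visible, so a wrapper script can flag the screenshot rather than record a wrong score.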

The above approach would probably adapt to other games, and you could reuse the same digit samples.
File compare could be replaced by XORing the two images and then calculating the mean of all pixels (it should be 0 on a match).
There are other methods of comparison, too. You get the point.
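
The XOR-and-mean comparison could be sketched as follows, assuming same-sized, non-empty grids of integer pixel values (for example 0 to 255 grayscale). The `closest_digit` helper is an optional extension that is not part of the suggestion above: it picks the template with the lowest score instead of requiring exactly 0.

```python
def xor_mean(a, b):
    """Mean of the per-pixel XOR of two same-sized grids of integers.

    The result is 0 only when every pixel is identical.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("grids must have the same dimensions")
    total = sum(pa ^ pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    count = sum(len(row) for row in a)
    return total / count


def closest_digit(cell, templates):
    """Return the key of the template with the smallest XOR mean."""
    return min(templates, key=lambda digit: xor_mean(cell, templates[digit]))
```

Note that the XOR of two pixel values is not a perceptual distance, so a nonzero mean is best read as "not identical" rather than as a measure of how different the images look.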

You'd be better off investing your time in collecting the score digit coordinates for each game than in struggling with quirky OCR results.
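
One possible way to organize such a per-game coordinate collection is a small lookup table. Everything in this sketch is hypothetical: the `Field` class, the `LAYOUTS` table, and especially the numbers, which are placeholders and were not measured from any real game.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    top: int     # y coordinate of the digit row
    left: int    # x coordinate of the first digit cell
    digits: int  # number of digit cells in the field
    pitch: int   # horizontal distance between neighbouring cells


# Placeholder values for illustration only, not taken from a real game.
LAYOUTS = {
    "example-game": {
        "score": Field(top=10, left=20, digits=7, pitch=60),
        "time": Field(top=10, left=200, digits=3, pitch=60),
    },
}


def digit_positions(game, field_name, layouts=LAYOUTS):
    """Return the (top, left) corner of every digit cell in a field."""
    field = layouts[game][field_name]
    return [(field.top, field.left + i * field.pitch) for i in range(field.digits)]
```

Supporting another game then means adding one entry to the table; the matching code itself stays unchanged.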

Well, unless you're eager to.

Leah Siddall

Apr 23, 2015, 6:00:49 PM
to tesser...@googlegroups.com
*mind blown* This is a much better approach!! Especially given how quickly I found something like this:

There will be a learning curve, but I agree this will be a much more accurate approach.
The link you sent me is perfect for understanding the theory and a possible workflow.
Would you happen to know of another project like Tesseract (Linux/OS X based) that I could investigate for this purpose?

And thank you very much for shifting my attention away from OCR. NES games can only have so many palettes (which you can easily extract) and are restricted to certain sizes, so it should be easy to create a matching library by hand.

Dmitri Silaev

Apr 24, 2015, 5:11:00 AM
to tesser...@googlegroups.com
Glad you found that approach appealing. Believe me, it's the best way to go. We at CustomOCR sometimes use it in our solutions to achieve 100% accuracy where appropriate.

As for dedicated software - can't name anything concrete, but I regularly see such things on the internet - like open source apps for pulling out text from old video game screenshots. Look for it yourself and you would find something for sure.

Anyway, doing it by hand would be an easy and fun project for any programmer. I described my steps with ImageMagick and scripting in mind, but if you'd like to program it yourself, go ahead. No math or other specific knowledge required.

smwikipedia smwikipedia

May 11, 2015, 11:03:52 PM
to tesser...@googlegroups.com
Hi Leah,

I am having a similar issue recognizing raster fonts. Could you share your progress? Thanks!

On Thursday, April 23, 2015 at 2:38:29 PM UTC+8, Leah Siddall wrote: