Tesseract does not recognize "comma" when using page segmentation = single char

Zeulopes

Aug 4, 2013, 8:04:20 PM
to tesser...@googlegroups.com
Hello Guys!

I'm using the Tesseract API (version 3.02) for single-character recognition in English (eng.traineddata), with the following parameters:

api->SetPageSegMode(tesseract::PSM_SINGLE_CHAR);
and
api->SetVariable("tessedit_char_whitelist", "0123456789,");

When I pass an image containing a single digit (see "three.png"), it works very well, but it produces no result at all when I pass an image of a comma (see "comma.png"), even though I included "," in the tessedit_char_whitelist variable.
I tried different sizes and resolutions, with no success.

Do you guys have any idea what I'm doing wrong or missing? Is there any way to set up Tesseract to recognize the comma?

Best regards!

Attached images:

comma.png
three.png

Quan Nguyen

Aug 10, 2013, 11:12:56 AM
to tesser...@googlegroups.com
You could have better results with 300-DPI, binary or grayscale images.
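
As a rough illustration of that preparation step, here is a minimal C++ sketch. The helper names scaledLength and binarize, the fixed threshold of 128, and the 8-bit grayscale buffer layout are all assumptions made for the example; none of them are part of the Tesseract API. It only computes the pixel size an image would need at 300 DPI and thresholds pixels to black or white.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: pixel length needed to keep the same physical size
// when an image is resampled from src_dpi to dst_dpi (rounded to nearest).
int scaledLength(int pixels, double src_dpi, double dst_dpi = 300.0) {
    assert(pixels >= 0 && src_dpi > 0.0 && dst_dpi > 0.0);
    return static_cast<int>(std::lround(pixels * dst_dpi / src_dpi));
}

// Hypothetical helper: fixed-threshold binarization of an 8-bit grayscale
// buffer. Pixels darker than the threshold become 0 (black), all others
// become 255 (white). The default threshold of 128 is an assumption.
std::vector<std::uint8_t> binarize(const std::vector<std::uint8_t>& gray,
                                   std::uint8_t threshold = 128) {
    std::vector<std::uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) {
        out[i] = gray[i] < threshold ? 0 : 255;
    }
    return out;
}
```

For example, a 24-pixel-wide glyph scanned at 72 DPI would need scaledLength(24, 72.0) = 100 pixels at 300 DPI. The actual resampling would be done by an image library (such as Leptonica's pixScale) before the image is handed to Tesseract.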