Single character recognition


Vipul Aggarwal

Apr 10, 2014, 12:03:20 PM
to tesser...@googlegroups.com
I am working on images that each contain a single character.
However, Tesseract is unable to recognize them.

This is how I initialized it:
tess.Init(NULL, "eng", tesseract::OEM_DEFAULT);
tess.SetPageSegMode(tesseract::PSM_SINGLE_BLOCK_VERT_TEXT);
tess.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZ");

This is what I do for recognition:
tess.SetImage((uchar*)rotatedImage.data, rotatedImage.cols, rotatedImage.rows, 1, rotatedImage.cols);
char* out = tess.GetUTF8Text();


I am attaching the images:
1.jpg
2.jpg
3.jpg

Vipul Aggarwal

Apr 18, 2014, 3:37:03 AM
to tesser...@googlegroups.com
Running Tesseract from the terminal gives good results, so something might be wrong in my C++ implementation.
Please respond!

Paul

Apr 22, 2014, 4:31:44 AM
to tesser...@googlegroups.com
Maybe you could try to use the ResultIterator.

Chris Dopuch

Apr 24, 2014, 5:11:58 PM
to tesser...@googlegroups.com
Could you post the exact command line you used to get good results for these images?

zdenko podobny

Apr 24, 2014, 5:30:46 PM
to tesser...@googlegroups.com
  1. Why do you use PSM_SINGLE_BLOCK_VERT_TEXT for a single character?
  2. If I understand correctly, the command line reads the image data from disk, while your API code reads it from memory (OpenCV?). Try to use the same source in both cases, e.g. read the image from disk in the API case too, to rule out mistakes in reading from memory.

Zdenko
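
One kind of in-memory mistake that point 2 could cover, offered as a guess that the thread does not confirm: the original SetImage call passes rotatedImage.cols as bytes_per_line, which only holds when the buffer is 8-bit, single-channel, and its rows have no padding. The sketch below uses a hypothetical helper, PackGrayRows, to copy a strided grayscale buffer into a tightly packed one so that bytes_per_line equals the width.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not from the thread): copies an 8-bit grayscale image
// whose rows start `stride` bytes apart into a buffer with no row padding.
// Returns an empty vector if the arguments are inconsistent.
std::vector<unsigned char> PackGrayRows(const unsigned char* data, int width,
                                        int height, std::size_t stride) {
    std::vector<unsigned char> packed;
    if (data == nullptr || width <= 0 || height <= 0 ||
        stride < static_cast<std::size_t>(width)) {
        return packed;
    }
    packed.resize(static_cast<std::size_t>(width) * height);
    for (int row = 0; row < height; ++row) {
        // Copy only the `width` meaningful bytes of each source row.
        std::memcpy(&packed[static_cast<std::size_t>(row) * width],
                    data + row * stride, static_cast<std::size_t>(width));
    }
    return packed;
}
```

With OpenCV the row stride is available as rotatedImage.step, so an alternative to copying is to pass that value directly as the last argument of SetImage. Either way, the image should be checked to be of type CV_8UC1 before passing 1 as bytes_per_pixel.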


--
--
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to tesser...@googlegroups.com
To unsubscribe from this group, send email to
tesseract-oc...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en

Vipul Aggarwal

Apr 25, 2014, 1:57:18 PM
to tesser...@googlegroups.com
1. I normally use the single-character PSM. I had changed it as an experiment and forgot to change it back before posting.

2. Yes, I was passing the image from OpenCV earlier. I tried saving the image, reading it back with Leptonica, and passing that to Tesseract. The results somehow improved.

Vipul Aggarwal

Apr 25, 2014, 1:58:28 PM
to tesser...@googlegroups.com
tesseract image.jpg -psm 10 out