Box recognition problem with tessedit_char_whitelist


Alessandro Roma

May 12, 2014, 8:39:13 AM
to tesser...@googlegroups.com
Good morning,

I'm having some recognition problems when using tessedit_char_whitelist.

The real value is 586,67.

With the options   -psm 4 -c tessedit_char_whitelist=",0123456789."  I obtain  <336,267>
With the options   -psm 4 -c tessedit_char_whitelist="0123456789."  I obtain  < 58 6 267  >  (why are there spaces here?)
With the options   -psm 4 -c tessedit_char_whitelist="0123456789,"  I obtain  < 58 6 ,267  >  (why are there spaces here?)
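
For anyone who wants to reproduce the three runs above, here is a minimal Python sketch. It assumes the `tesseract` binary is on the PATH and accepts the 3.x-style `-psm` and `-c` options quoted in this post. The helper names `build_tesseract_cmd` and `run_all` and the `out_N` output base names are made up for illustration.

```python
import subprocess

# The three whitelists tried in the post.
WHITELISTS = [",0123456789.", "0123456789.", "0123456789,"]


def build_tesseract_cmd(image, out_base, whitelist, psm=4):
    """Return the argv list for one run, using the options quoted in the post."""
    return [
        "tesseract", image, out_base,
        "-psm", str(psm),
        "-c", "tessedit_char_whitelist=" + whitelist,
    ]


def run_all(image="crop_importo2.tif"):
    """Run Tesseract once per whitelist and print each raw result.

    Assumption: tesseract is installed and writes its text to <out_base>.txt.
    """
    for i, whitelist in enumerate(WHITELISTS):
        out_base = "out_%d" % i  # hypothetical output base name
        subprocess.run(build_tesseract_cmd(image, out_base, whitelist), check=True)
        with open(out_base + ".txt") as f:
            print(repr(whitelist), "->", repr(f.read()))
```

Calling `run_all()` next to the image prints the three outputs side by side. Passing the arguments as a list avoids shell quoting problems with the comma and the dot.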

Having the dot and the comma together in the whitelist seems to have a bad influence on Tesseract.
The psm value seems to have no influence on the results.

Any advice on this behaviour?

Thank you,
Alessandro.


crop_importo2.tif

Alessandro Roma

May 12, 2014, 8:53:14 AM
to tesser...@googlegroups.com
I'm using Tesseract 3.03, built from source.

zdenko podobny

May 12, 2014, 11:11:55 AM
to tesser...@googlegroups.com
Fix your image and you will get the correct result.


Zdenko


On Mon, May 12, 2014 at 2:53 PM, Alessandro Roma <al.ba...@gmail.com> wrote:
> I'm using Tesseract 3.03, built from source.


crop_importo2_fixed.png

Alessandro Roma

May 12, 2014, 11:50:37 AM
to tesser...@googlegroups.com
Ehm, OK, but in the real world it is nearly impossible to get such a clean image.

The image I posted has already been cleaned automatically and looks very good.
Do you mean that Tesseract interprets the black pixels at the top as another character? If so, is there a way to give it a minimum character height so that it ignores these pixels?
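
The thread does not say whether Tesseract has such an option. The idea of ignoring blobs below a minimum height can instead be sketched as a pre-processing step on the binarized image. This is only an illustration in plain Python, not something Tesseract does itself. The function name `remove_short_blobs` and the list-of-rows image format (1 = black pixel) are assumptions. Note that a height filter also erases commas and dots if `min_height` is set above their height, so the threshold has to be chosen with care.

```python
def remove_short_blobs(img, min_height):
    """Return a copy of img with short blobs erased.

    img is a list of rows of 0/1 values (1 = black ink). Every 8-connected
    blob whose bounding-box height is below min_height is set to 0.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    out = [row[:] for row in img]
    seen = [[False] * w for _ in range(h)]

    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Flood-fill one blob, collecting its pixels.
            stack = [(y, x)]
            seen[y][x] = True
            pixels = []
            while stack:
                cy, cx = stack.pop()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            # Erase the blob if it is shorter than the threshold.
            rows = [p[0] for p in pixels]
            if max(rows) - min(rows) + 1 < min_height:
                for py, px in pixels:
                    out[py][px] = 0
    return out
```

The cleaned rows would then have to be written back to an image file with an imaging library before being passed to Tesseract.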