How to make Tesseract recognize only numbers and no spaces

Nuttivoot Chalongamonkul

Jan 8, 2014, 5:48:56 AM
to tesser...@googlegroups.com
Hi,
I want Tesseract to recognize only numbers, with no spaces.
The problem is that the output .txt file contains a space.
I run this on the command line:
tesseract image.png output digits

The output is "01 3352003564002", but I want "013352003564002".

How can I do this?
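
One workaround that does not depend on Tesseract's behavior is to clean the text after OCR. The following is a minimal sketch, assuming the recognized text has already been read from the output .txt file into a string; the function name `digits_only` is made up for illustration and is not part of Tesseract.

```python
import re


def digits_only(ocr_text: str) -> str:
    """Drop every character that is not an ASCII digit.

    This removes spaces, newlines and any stray non-digit marks,
    so it only makes sense when the expected result is one number.
    """
    return re.sub(r"[^0-9]", "", ocr_text)


# Example using the output quoted in the post above.
print(digits_only("01 3352003564002\n"))  # prints 013352003564002
```

This fixes the text after the fact rather than changing what Tesseract recognizes, so a misread digit would still pass through unchanged.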

Michael Beauregard

Jan 9, 2014, 2:20:45 PM
to tesser...@googlegroups.com
I have exactly the same issue. 

So far I've been building up a layer in my application that uses 'save_blob_choices' and 'save_best_choices' so that I can walk through the results with the ResultIterator and ChoiceIterator, rejecting or selecting choices based on regular expressions. This isn't going too badly, but I still need to handle the case where spaces appear unexpectedly. Also, regular expressions may not be the best way to guide choice selection, but as a first attempt it's not terrible.
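
The selection idea described above can be sketched without the Tesseract API. This is only an illustration: it assumes the alternatives for each symbol have already been collected into plain Python lists of (text, confidence) pairs, and the names `pick_choice` and `select_digits` are hypothetical. It does not show how to drive ResultIterator or ChoiceIterator.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical shape for one alternative: (candidate text, confidence).
Choice = Tuple[str, float]


def pick_choice(choices: List[Choice], pattern: "re.Pattern") -> Optional[str]:
    """Return the most confident candidate that fully matches the pattern."""
    for text, _conf in sorted(choices, key=lambda c: c[1], reverse=True):
        if pattern.fullmatch(text):
            return text
    return None  # nothing acceptable at this position


def select_digits(symbols: List[List[Choice]]) -> str:
    """Keep one digit per symbol position; skip positions with no digit candidate."""
    digit = re.compile(r"[0-9]")
    picked = (pick_choice(choices, digit) for choices in symbols)
    return "".join(p for p in picked if p is not None)


# Made-up alternatives: "O" outranks "0" in the first position,
# and the second position only offers a space.
symbols = [
    [("O", 0.90), ("0", 0.80)],
    [(" ", 0.60)],
    [("1", 0.95), ("l", 0.40)],
]
print(select_digits(symbols))  # prints 01
```

Positions where no candidate matches, such as an unexpected space, are simply dropped, which is one possible way to handle the stray-space case mentioned above.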

zdenko podobny

Jan 9, 2014, 2:56:35 PM
to tesser...@googlegroups.com
Did you try setting the page segmentation mode to word?

Zdenko
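
For reference, a minimal sketch of how the suggestion above might be passed on the command line, wrapped in Python. It assumes page segmentation mode 8 (treat the image as a single word) and that the flag is spelled `-psm` in Tesseract 3.x and `--psm` in later releases; the helper names `build_command` and `run_ocr` are hypothetical. Whether this removes the space for a given image is not confirmed in this thread.

```python
import subprocess
from typing import List


def build_command(image: str, output_base: str, psm: int = 8,
                  legacy_flag: bool = True) -> List[str]:
    """Build a tesseract command line with an explicit page segmentation mode.

    legacy_flag=True uses the Tesseract 3.x spelling (-psm);
    set it to False for newer releases (--psm).
    """
    flag = "-psm" if legacy_flag else "--psm"
    # The "digits" config file from the original command stays at the end.
    return ["tesseract", image, output_base, flag, str(psm), "digits"]


def run_ocr(image: str, output_base: str, **kwargs) -> None:
    """Run tesseract; requires the tesseract binary to be on PATH."""
    subprocess.run(build_command(image, output_base, **kwargs), check=True)


print(" ".join(build_command("image.png", "output")))
# prints tesseract image.png output -psm 8 digits
```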

