Which option limits the output to the ASCII character set only?
Some letters, such as "a", are output as glyph symbols.
That refers to an ancient version of Tesseract. Since then, Tesseract
has added support for languages other than English and uses Unicode by
default. I don't think there's any option to restrict output to ASCII.
You might want to try something like unaccent (http://www.nongnu.org/unac/)
to strip the accents from the output afterwards.
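
If you'd rather not install anything, the same kind of post-processing can be sketched in a few lines of Python. This is not unaccent itself and not a Tesseract feature, just a rough stand-in that uses the standard unicodedata module. The function name to_ascii is made up for this example.

```python
import unicodedata


def to_ascii(text: str) -> str:
    """Reduce text to ASCII (hypothetical helper, not part of Tesseract or unac).

    NFKD normalization splits an accented letter into its base letter plus
    combining marks, and expands compatibility characters such as ligatures.
    Encoding with errors="ignore" then drops everything that is not ASCII.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    # Run the OCR output (already read into a string) through the filter.
    print(to_ascii("café naïve"))
```

Note that characters with no ASCII decomposition (for example "ß" or "ø") are silently dropped rather than transliterated, so this is lossy. A dedicated tool like unaccent may handle those cases better.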