I'd be grateful if someone could help me here.
Here is my request to Zdenko and the reply.
Could you perhaps help me understand the meaning of the following, and then change the page:
"A quick check is to count the pixels of the x-height of your characters. (X-height is the height of the lower case x.)"
I have no idea what this means or how to do it.
Well then it would be better if you found something other than tesseract. Honestly. You will be lost and disappointed with tesseract because tesseract requires some knowledge (e.g. of image processing). It could be compared to university: if you got there, it is expected that you finished your studies in high school. Nobody there will bother to explain the basics to you... IMO there cannot be a clearer definition of x-height and what to do with it. BTW it is in the FAQ and you complain about wrong information in the Compilation wiki ;-)
Here is what the FAQ says:
There is a minimum text size for reasonable accuracy. You have to consider resolution as well as point size. Accuracy drops off below 10pt x 300dpi, rapidly below 8pt x 300dpi. A quick check is to count the pixels of the x-height of your characters. (X-height is the height of the lower case x.) At 10pt x 300dpi x-heights are typically about 20 pixels, although this can vary dramatically from font to font. Below an x-height of 10 pixels, you have very little chance of accurate results, and below about 8 pixels, most of the text will be "noise removed".
So if someone could help me, I'm sure I wouldn't be the only one to benefit.
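For anyone else puzzling over the same FAQ paragraph, here is a rough sketch in Python of the arithmetic it seems to describe. It is only an illustration, not anything taken from tesseract itself. The function names are made up for this post, and the x-height ratio of 0.5 (x-height as a fraction of the point size) is an assumption that varies from font to font. The only fixed fact used is that there are 72 points to the inch.

```python
def estimated_x_height_px(point_size, dpi, x_height_ratio=0.5):
    """Estimate the x-height in pixels for text of a given point size.

    x_height_ratio is an ASSUMED font property (x-height / point size);
    0.5 is a common ballpark, but real fonts differ.
    """
    em_px = point_size / 72.0 * dpi  # 72 points per inch
    return em_px * x_height_ratio


def measured_x_height_px(glyph_rows, ink="#"):
    """Count pixel rows spanned by a cropped image of a lowercase x.

    glyph_rows is a hypothetical stand-in for a scanned image: a list of
    strings, one per pixel row, where `ink` marks a dark pixel.
    """
    inked = [i for i, row in enumerate(glyph_rows) if ink in row]
    if not inked:
        return 0
    return inked[-1] - inked[0] + 1


def faq_verdict(x_height_px):
    """Map an x-height in pixels onto the thresholds quoted from the FAQ."""
    if x_height_px < 8:
        return "most of the text will probably be removed as noise"
    if x_height_px < 10:
        return "very little chance of accurate results"
    if x_height_px < 20:
        return "accuracy may start to drop off"
    return "text size should be adequate"


if __name__ == "__main__":
    px = estimated_x_height_px(10, 300)
    print(f"10pt at 300dpi: about {px:.1f} px x-height")
    print(faq_verdict(px))
    print(faq_verdict(estimated_x_height_px(10, 100)))
```

In practice, "counting the pixels" means opening the scanned image in any image editor, zooming in on a lowercase x, and counting how many pixel rows it is tall. If the count is well under 20, rescanning at a higher dpi or enlarging the image before OCR is the usual remedy.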
Cheers kevin1m...@gmail.com 32 Hawera Rd Kohimarama 1071 Auckland, New Zealand +64 (0)9 528 1174 home +64 (0)226 710 335 cell http://kmccready.wordpress.com