My "language" makes tesseract crash

veilleux...@gmail.com

Jul 24, 2007, 8:10:01 AM
to tesseract-ocr
I made my own "language" by following this guide:
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract

This language is only meant to recognize a font used on diagrams.

Now, when I try it with my files, tesseract fails when it tries to
recognize an image.

With the debugger I found that it fails here, in unicharset.cpp:

const UNICHAR_ID UNICHARSET::unichar_to_id(const char* const unichar_repr,
                                           int length) const {
  assert(length > 0 && length <= UNICHAR_LEN);
  assert(ids.contains(unichar_repr, length));  // <= fails here when unichar_repr = "m"
  return ids.unichar_to_id(unichar_repr, length);
}

Perhaps it fails because my unicharset contains only uppercase
letters?

But why does it detect a lowercase letter in my picture, which is one I
used to train tesseract and which has only uppercase letters? (In fact,
there are only 40 characters in my unicharset.)
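
For illustration, here is a minimal sketch of the failure mode described above: a strict lookup that asserts the string is known, next to a lenient lookup and a helper that lists which strings are missing. This is not Tesseract's code. ToyCharset, to_id_or_invalid and missing_chars are hypothetical names, and the std::map is only a stand-in for whatever structure the real UNICHARSET uses.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a unicharset (hypothetical, NOT Tesseract's UNICHARSET):
// maps each known character string to a small integer id.
class ToyCharset {
 public:
  explicit ToyCharset(const std::vector<std::string>& chars) {
    for (const std::string& c : chars) {
      // Ids are assigned in insertion order; duplicates are ignored.
      ids_.emplace(c, static_cast<int>(ids_.size()));
    }
  }

  bool contains(const std::string& repr) const {
    return ids_.count(repr) > 0;
  }

  // Strict lookup in the style of the quoted function: aborts via assert
  // when the string is not in the set (e.g. "m" in an uppercase-only set).
  int to_id(const std::string& repr) const {
    assert(contains(repr));
    return ids_.at(repr);
  }

  // Lenient lookup: returns -1 instead of aborting.
  int to_id_or_invalid(const std::string& repr) const {
    auto it = ids_.find(repr);
    return it == ids_.end() ? -1 : it->second;
  }

 private:
  std::map<std::string, int> ids_;
};

// Returns, in input order, the strings from `wanted` that the set lacks.
std::vector<std::string> missing_chars(const ToyCharset& charset,
                                       const std::vector<std::string>& wanted) {
  std::vector<std::string> missing;
  for (const std::string& w : wanted) {
    if (!charset.contains(w)) missing.push_back(w);
  }
  return missing;
}
```

With a set built from uppercase letters only, to_id("m") would trip the assert, while missing_chars reports "m" as absent. That makes it easier to see which character a lookup is asking for before the program aborts.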

veilleux...@gmail.com

Jul 24, 2007, 8:40:46 AM
to tesseract-ocr
OK, I have fixed my previous errors. It was just a bad code in my
unicharset file.


withbl...@gmail.com

Jul 24, 2007, 1:20:23 PM
to tesser...@googlegroups.com
Hi,
Will you please post an extract of the errors you noticed, as well as an extract of the fixed unicharset file, for the benefit of newbies?
-74yrsold