Trained Tesseract but still cannot properly find a char...help appreciated.

James Melby

May 28, 2015, 4:27:29 PM

to tesser...@googlegroups.com

I trained Tesseract (starting from the eng language data) to work with a particular customer-derived font.

After finishing the training, I ran recognition on other image files, and I get errors when reading a 0 (zero). It is sometimes returned as an 8, Q, D, or O (the letter).

The images I am using do not mix font sizes and contain no lowercase characters.

I have tried using the unicharambigs file, but I'm not sure it is being applied correctly. I also renamed it to eng.unicharambigs, and that had no effect. I even set the Type Indicator value to 1 to make the substitution mandatory, and still nothing changed.
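For reference, here is a minimal sketch of what such a file could contain for the confusions described above. It assumes the Tesseract 3.x "v1" unicharambigs layout: a first line reading v1, then one tab-separated rule per line giving the number of source characters, the source characters, the number of replacement characters, the replacement characters, and the type indicator (1 = mandatory, 0 = optional). The helper names (ambig_line, build_unicharambigs) and the RULES list are hypothetical and only illustrate the format. Note that a mandatory rule applies to every matching character, so mapping 8 to 0 would also rewrite genuine 8s.

```python
# Sketch only: generates text in the assumed Tesseract 3.x "v1" unicharambigs layout.
# The function names and RULES below are hypothetical, not part of Tesseract.

def ambig_line(wrong, right, mandatory=True):
    """Format one rule as: <n> <wrong chars> <m> <right chars> <type>, tab-separated."""
    wrong_chars = list(wrong)
    right_chars = list(right)
    fields = [
        str(len(wrong_chars)),
        " ".join(wrong_chars),      # characters within a field are space-separated
        str(len(right_chars)),
        " ".join(right_chars),
        "1" if mandatory else "0",  # type indicator: 1 = mandatory, 0 = optional
    ]
    return "\t".join(fields)


def build_unicharambigs(rules):
    """Return the full file text: the version line followed by one rule per line."""
    lines = ["v1"]
    lines += [ambig_line(wrong, right, mandatory) for wrong, right, mandatory in rules]
    return "\n".join(lines) + "\n"


# Assumed rules for the misreads in this post: each character is forced to zero.
RULES = [
    ("O", "0", True),
    ("Q", "0", True),
    ("D", "0", True),
    ("8", "0", True),
]

if __name__ == "__main__":
    print(build_unicharambigs(RULES), end="")
```

Renaming the file alone may not be enough. As far as I understand, the rules only take effect once the file is packed into eng.traineddata (for example with combine_tessdata), and every character used in a rule has to exist in the unicharset.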

Anyway, I feel I'm missing something simple, but I have wound myself around this problem to the point where I am lost in the middle of the forest.

If someone can point me in the right direction or offer a few pointers, I would appreciate the help.

The files created or used:

Trained Data: https://drive.google.com/file/d/0B3S3PcVl6aznbHh5TW9HX3FCdlU/view?usp=sharing

Image: https://drive.google.com/file/d/0B3S3PcVl6azndXRvcGhCRlJRUWc/view?usp=sharing

Box: https://drive.google.com/file/d/0B3S3PcVl6aznYWNZaUJ2RVFDU2s/view?usp=sharing

unicharambigs: https://drive.google.com/file/d/0B3S3PcVl6aznUHdKX2RucExBaXM/view?usp=sharing

font_properties: https://drive.google.com/file/d/0B3S3PcVl6aznTVl6OUNvRnh0LXM/view?usp=sharing

Thanks,

Jim
