Hello Dev team,
I'm a newcomer to Tesseract.
I am developing trained data files for the Ge'ez script and some of its child writing systems. Amharic and Tigrinya are the national languages of Ethiopia and Eritrea, respectively. I use the files in my own proprietary work, but the language data will be released under GPLv3 for open-source use by others.
There is one file for Amharic floating around here:
http://code.google.com/p/tesseract-ocr/issues/detail?id=859
However, this implementation only uses a limited number of fonts and does not include punctuation or char ambigs.
I need your advice: jTessBoxEditor lags on my Mac when I open multipage TIFFs. What is the fastest box editor that supports multipage TIFFs and can delete, merge and insert boxes?
Your advice will be invaluable in helping to expand Tesseract's multilingual support.
Thanks,