Hello all!
I have a big problem with tesseract-ocr.
I downloaded the Tesseract usage example from the official page (net.sourceforge.tess4j.example) just to test how it works.
I also downloaded almost all of the tessdata files (I don't know what the difference between them is) and ran the Java program (using net.sourceforge.tess4j).
I used a very simple TIFF file as a test, and the results were not very good. Some words were recognized correctly, but others were not: BEST instead of DEST, DEF instead of DEP, etc.
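To make the errors concrete, here is a rough sketch of how the expected text can be compared with the OCR output, one character at a time. The names OcrDiff and mismatches are made up for illustration and are not part of Tess4J. It also assumes both strings line up position by position, so it is only useful for substitutions like the ones above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tess4J: lists the positions where the
// OCR output differs from the text that was expected.
class OcrDiff {

    // Returns entries such as "D->B@0" (expected 'D', got 'B', at index 0).
    static List<String> mismatches(String expected, String actual) {
        List<String> diffs = new ArrayList<>();
        int common = Math.min(expected.length(), actual.length());
        for (int i = 0; i < common; i++) {
            char e = expected.charAt(i);
            char a = actual.charAt(i);
            if (e != a) {
                diffs.add(e + "->" + a + "@" + i);
            }
        }
        // A length difference is reported separately, since the positional
        // comparison above cannot describe inserted or dropped characters.
        if (expected.length() != actual.length()) {
            diffs.add("length " + expected.length() + "->" + actual.length());
        }
        return diffs;
    }

    public static void main(String[] args) {
        System.out.println(mismatches("DEST", "BEST")); // [D->B@0]
        System.out.println(mismatches("DEP", "DEF"));   // [P->F@2]
    }
}
```

In both of my examples only a single letter is wrong (D read as B, P read as F), which is why I suspect the font rather than the image quality.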
I understand that I need to train Tesseract to recognize my image (font, size, etc.), but I don't know how to go about it. Is there any documentation on this?
I know that some files need to be put in the tessdata directory, but how do I create them?
I also downloaded jTessBoxEditor, loaded a demo image with my text, and trained something in the Trainer tab, but after the training nothing happened...
Can somebody help me or tell me how to solve these problems?
Many thanks for considering my request!