Hi,
I have installed the tesseract-ocr (v3.03) package on Ubuntu 12.04.3 LTS and am having difficulty getting it to run.
I first located the tessdata directory:
$ sudo find / -name tessdata
/usr/share/tesseract-ocr/tessdata
Then set the environment variable TESSDATA_PREFIX to tessdata's parent directory:
$ export TESSDATA_PREFIX=/usr/share/tesseract-ocr/
And then checked to make sure this was set correctly:
$ echo $TESSDATA_PREFIX
/usr/share/tesseract-ocr/
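As an extra sanity check, here is a minimal sketch that tests whether the language file is readable at the location the error message describes, that is, in a "tessdata" directory directly under the prefix. The function name check_tessdata is made up for this sketch and is not part of tesseract, and the fallback path is simply the directory found above:

```shell
# Hypothetical helper (not a tesseract command): report whether
# <prefix>/tessdata/<lang>.traineddata exists and is readable.
check_tessdata() {
    prefix="${1%/}"      # drop one trailing slash so both forms behave the same
    lang="${2:-eng}"     # default to English if no language is given
    file="$prefix/tessdata/$lang.traineddata"
    if [ -r "$file" ]; then
        echo "found: $file"
    else
        echo "missing: $file"
        return 1
    fi
}

# Use the exported value if present, otherwise the directory located with find.
check_tessdata "${TESSDATA_PREFIX:-/usr/share/tesseract-ocr}" eng || true
```

If this reports "missing", the problem is with the data files themselves (not installed, or not readable by the current user) rather than with the variable.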
When I run tesseract against a test image, I receive the following error:
$ tesseract test.png out
Error opening data file tesseract/tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Tesseract Open Source OCR Engine v3.03 with Leptonica
AdaptedTemplates != NULL:Error:Assert failed:in file adaptmatch.cpp, line 174
Segmentation fault (core dumped)
The first line of the error suggests that tesseract is not taking the value from TESSDATA_PREFIX and is instead trying to locate the language files in 'tesseract/tessdata'.
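Since echo only shows what the current shell holds, a small sketch to confirm that a child process (such as tesseract) would actually inherit the variable is to read it back with printenv, which only sees exported variables. This assumes the same shell session as the export above:

```shell
# Export the prefix (same value as above), then read it back from a child process.
export TESSDATA_PREFIX=/usr/share/tesseract-ocr/
printenv TESSDATA_PREFIX
```

If printenv prints nothing, the variable was set but not exported, or the command is being run from a different shell session.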
Can anyone advise where I am going wrong?
Thanks,
Chris