languages not loaded


Evgeni Dimitrov

Jan 31, 2017, 1:30:58 PM
to tesseract-ocr
Hello,
At
https://github.com/tesseract-ocr/tesseract/wiki

I saw there is a Windows installer for Tesseract 3.05-dev, provided by UB Mannheim, at
https://github.com/UB-Mannheim/tesseract/wiki

I installed it. It works with English. On the last page there is a link to "download the appropriate training data" for another language at
https://github.com/tesseract-ocr/tessdata

On this page it says "These language data files only work with Tesseract 4. Get language data files for Tesseract 3.04 or 3.05 from the 3.04 tree.". The new link is:
https://github.com/tesseract-ocr/tessdata/tree/3.04.00

I downloaded bul.traineddata and deu.traineddata from there and copied them to
. . . \Tesseract-OCR\tessdata
Now tesseract --list-langs shows
bul
deu
eng
osd

But it really works only with English. For Bulgarian and for German the message is:
Failed loading language 'deu'
Tesseract couldn't load any languages!
Could not initialize tesseract.

Any advice - what could be wrong?

ShreeDevi Kumar

Jan 31, 2017, 9:44:09 PM
to tesser...@googlegroups.com
Check the value of the TESSDATA_PREFIX environment variable and which traineddata files are in that location.
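
For anyone who wants to script that check, here is a minimal Python sketch. It reads TESSDATA_PREFIX and lists the language codes it finds. The helper names are hypothetical. It also assumes that, depending on the Tesseract version, the variable may point either at the parent of the tessdata folder or at the tessdata folder itself, so it looks in both places.

```python
import os
from pathlib import Path


def candidate_tessdata_dirs(prefix):
    """Return the directories where traineddata files might be looked up.

    Assumption: TESSDATA_PREFIX is either the parent of "tessdata" or the
    "tessdata" directory itself, so both candidates are returned.
    """
    base = Path(prefix)
    return [base / "tessdata", base]


def list_languages(prefix):
    """Map each existing candidate directory to the language codes in it."""
    found = {}
    for directory in candidate_tessdata_dirs(prefix):
        if directory.is_dir():
            # "deu.traineddata" -> "deu"
            found[str(directory)] = sorted(
                p.stem for p in directory.glob("*.traineddata")
            )
    return found


if __name__ == "__main__":
    prefix = os.environ.get("TESSDATA_PREFIX")
    if prefix is None:
        print("TESSDATA_PREFIX is not set")
    else:
        for directory, langs in list_languages(prefix).items():
            print(directory, langs)
```

Note that this only shows which files are present. A file can be listed here (and by tesseract --list-langs) and still fail to load if its content is not valid.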

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/d4c5a5d6-66d5-4b92-bedc-3100477c6512%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Evgeni Dimitrov

Feb 1, 2017, 4:13:34 AM
to tesseract-ocr
Thank you ShreeDevi,

The first time I tried with TESSDATA_PREFIX pointing to
xxx\Tesseract-OCR

The second time I tried with TESSDATA_PREFIX pointing to
xxx\Tesseract-OCR\tessdata

I have in xxx\Tesseract-OCR\tessdata:
configs, tessconfigs, bul.traineddata, deu.traineddata, eng.cube.bigrams, eng.cube.fold, eng.cube.lm, eng.cube.nn, eng.cube.params, eng.cube.size, eng.cube.word-freq, eng.tesseract_cube.nn, eng.traineddata, eng.user-patterns, eng.user-words, osd.traineddata, pdf.ttf

In both cases the result is as described above.

Best regards
Evgeni

ShreeDevi Kumar

Feb 1, 2017, 8:08:48 AM
to tesser...@googlegroups.com
Check the sizes of the traineddata files and compare them against the repo to confirm that they were downloaded correctly.
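
A rough way to automate that comparison is sketched below in Python. It does not query the repo. Instead it flags files that are implausibly small or that start like an HTML page, which is one way a download can go wrong (for example, saving a web page instead of the binary file). The threshold MIN_PLAUSIBLE_BYTES and the function names are hypothetical, chosen only for illustration.

```python
from pathlib import Path

# Hypothetical threshold: genuine traineddata files are assumed to be
# considerably larger than this. Adjust it to the sizes shown in the repo.
MIN_PLAUSIBLE_BYTES = 100_000


def looks_like_html(path):
    """Return True if the file starts like an HTML document."""
    with open(path, "rb") as f:
        head = f.read(512).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


def check_traineddata(path):
    """Return a list of human-readable problems found for one file."""
    p = Path(path)
    problems = []
    size = p.stat().st_size
    if size < MIN_PLAUSIBLE_BYTES:
        problems.append(f"only {size} bytes")
    if looks_like_html(p):
        problems.append("content looks like an HTML page")
    return problems


def check_directory(tessdata_dir):
    """Check every *.traineddata file in a directory; keep only failures."""
    report = {}
    for p in sorted(Path(tessdata_dir).glob("*.traineddata")):
        problems = check_traineddata(p)
        if problems:
            report[p.name] = problems
    return report
```

Calling check_directory on the tessdata folder returns an empty dict when nothing looks suspicious. Any file it reports is a candidate for downloading again.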

ShreeDevi


Evgeni Dimitrov

Feb 1, 2017, 8:55:04 AM
to tesseract-ocr
I downloaded from
https://github.com/tesseract-ocr/tessdata/tree/3.04.00

The sizes of the downloaded files are
31/01/2017  13:14            33,854 bul.traineddata
31/01/2017  14:53            33,855 deu.traineddata

Evgeni Dimitrov

Feb 1, 2017, 9:03:51 AM
to tesseract-ocr
Thanks.
I downloaded again and it works fine.