Tesseract 4 cannot load new osd.traineddata


enkhbaata...@unimedia.co.jp

Sep 15, 2017, 2:04:27 AM
to tesseract-ocr

I've downloaded and compiled the latest Tesseract source from the git repository, but when I run the tesseract engine with --psm 0 or 1 it can't load osd.traineddata. My osd.traineddata file is from the best directory. When I try the older osd.traineddata file from the tessdata directory, it works fine. Has anyone experienced this error and does anyone know a fix for it? OS: Ubuntu 16.04.02 LTS

ShreeDevi Kumar

Sep 15, 2017, 2:13:30 AM
to tesser...@googlegroups.com
There have been updates to the tessdata files. Please see https://github.com/tesseract-ocr/tesseract/issues/995#issuecomment-329667239

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/b1f02aa9-60a5-4267-abfb-07413cbaf8a0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

enkhbaata...@unimedia.co.jp

Sep 15, 2017, 2:37:46 AM
to tesseract-ocr
Yes, I've checked this update, but the problem is that the engine fails to load the LSTM osd.traineddata.



enkhbaata...@unimedia.co.jp

Sep 15, 2017, 2:47:30 AM
to tesseract-ocr

The first attached image shows the files in the traineddata that I can load with Tesseract; the second image shows the files extracted from the osd.traineddata that fails to load.
 


ShreeDevi Kumar

Sep 15, 2017, 2:49:58 AM
to tesser...@googlegroups.com
The best traineddata, as you can see, does not have files supporting the legacy engine. It will not work with --oem 0.
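To make the point above concrete, here is a minimal Python sketch that takes the names of the files unpacked from a traineddata and reports which engine they could serve. The component names (`inttemp`, `pffmtable`, `normproto`, `lstm`) and the two sample listings are my assumptions, standing in for the screenshots; they are not copied from this thread.

```python
import os

# Assumption: the legacy classifier needs these three components,
# and the LSTM recognizer needs the "lstm" network component.
LEGACY_REQUIRED = {"inttemp", "pffmtable", "normproto"}
LSTM_REQUIRED = {"lstm"}


def component_suffix(filename):
    """Return the component name, e.g. 'out/osd.inttemp' -> 'inttemp'."""
    base = os.path.basename(filename)
    return base.split(".", 1)[1] if "." in base else base


def engines_supported(filenames):
    """Report which engines a set of unpacked component files could serve."""
    suffixes = {component_suffix(name) for name in filenames}
    return {
        "legacy": LEGACY_REQUIRED <= suffixes,
        "lstm": LSTM_REQUIRED <= suffixes,
    }


# Hypothetical listings standing in for the two screenshots.
old_osd = ["osd.unicharset", "osd.inttemp", "osd.pffmtable", "osd.normproto"]
best_osd = ["osd.lstm", "osd.lstm-unicharset", "osd.lstm-recoder", "osd.version"]
```

With these sample listings, `engines_supported(old_osd)` reports legacy support only and `engines_supported(best_osd)` reports LSTM support only, which matches the difference described above.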

ShreeDevi


enkhbaata...@unimedia.co.jp

Sep 15, 2017, 3:05:02 AM
to tesseract-ocr
I've tried running it with --oem 3 (the default, which uses LSTM mode) and with --oem 1 too; both fail to load osd.traineddata. Other LSTM language traineddata files work fine, so I guess the problem is only with loading the osd traineddata.
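For anyone who wants to repeat this comparison, a rough Python sketch of the test follows. The image path and tessdata directory are placeholders, and treating a non-zero exit code as "failed to load" is an assumption on my part rather than documented behaviour.

```python
import shutil
import subprocess


def osd_command(image, tessdata_dir, oem):
    """Build a tesseract command line that runs OSD only (--psm 0)."""
    return [
        "tesseract", image, "stdout",
        "--tessdata-dir", tessdata_dir,
        "--psm", "0",
        "--oem", str(oem),
    ]


def try_engine_modes(image, tessdata_dir, modes=(0, 1, 3)):
    """Run OSD once per engine mode; map mode -> exit code (None if not run)."""
    if shutil.which("tesseract") is None:
        # tesseract is not on PATH, so nothing can be checked here.
        return {mode: None for mode in modes}
    results = {}
    for mode in modes:
        proc = subprocess.run(
            osd_command(image, tessdata_dir, mode),
            capture_output=True,
            text=True,
        )
        results[mode] = proc.returncode
    return results
```

Calling `try_engine_modes("page.png", "/path/to/tessdata_best")` (both arguments hypothetical) once per tessdata directory gives a quick side-by-side of which combinations load.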

ShreeDevi Kumar

Sep 15, 2017, 3:15:41 AM
to tesser...@googlegroups.com
Please open an issue and provide all relevant information there. Thanks!

ShreeDevi


Zdenko Podobný

Sep 15, 2017, 3:28:39 AM
to tesser...@googlegroups.com
First I would double-check whether the OSD/language files were downloaded correctly (e.g. whether they can be unpacked).
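One way to run that check is sketched below in Python. It assumes the `combine_tessdata` tool from the Tesseract training tools is installed and that its `-u` option unpacks all components to a given prefix; relying on a zero exit code as the success signal is also an assumption.

```python
import os
import shutil
import subprocess
import tempfile


def unpack_command(traineddata, out_prefix):
    """Build the combine_tessdata call that unpacks every component."""
    return ["combine_tessdata", "-u", traineddata, out_prefix]


def looks_intact(traineddata):
    """Return True/False for unpack success, or None if it cannot be checked."""
    if not os.path.isfile(traineddata) or os.path.getsize(traineddata) == 0:
        return False  # missing or empty download
    if shutil.which("combine_tessdata") is None:
        return None  # tool not installed; nothing more to verify here
    with tempfile.TemporaryDirectory() as tmp:
        prefix = os.path.join(tmp, "check.")
        proc = subprocess.run(
            unpack_command(traineddata, prefix),
            capture_output=True,
            text=True,
        )
        # Success means a clean exit and at least one extracted component.
        return proc.returncode == 0 and len(os.listdir(tmp)) > 0
```

A truncated or HTML-error-page download would typically come back as `False` here, which separates a broken file from a genuine loading problem in the engine.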

Zdenko


ShreeDevi Kumar

Sep 15, 2017, 3:55:07 AM
to tesser...@googlegroups.com
The attached screenshots show that osd.traineddata has LSTM-related files in one version and legacy-engine-related files in the other. Please check the attachments in earlier messages.

ShreeDevi

Quan Nguyen

Sep 15, 2017, 8:29:33 PM
to tesseract-ocr
I've experienced the same issue with the best osd.traineddata ever since it was published, when testing Tess4J for Tesseract 4.0.0alpha.

ShreeDevi Kumar

Sep 15, 2017, 9:35:37 PM
to tesser...@googlegroups.com

>jbreiden commented 7 hours ago
As per discussion with Ray, I have taken osd.traineddata from tessdata and copied it to both tessdata_fast and tessdata_best. This should resolve the problem and is the best we can do today. Some day in the future, hopefully, there will be an Orientation Script Detector (OSD) that works with the LSTM recognizer.
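For a local checkout that predates this change, the same workaround can be applied by hand. The Python sketch below only illustrates the copy step; the function name and directory layout are hypothetical, and it overwrites the existing file in each target directory.

```python
import os
import shutil


def replace_osd(source_dir, target_dirs, name="osd.traineddata"):
    """Copy the legacy OSD model from source_dir into each target directory."""
    source = os.path.join(source_dir, name)
    if not os.path.isfile(source):
        raise FileNotFoundError(source)
    copied = []
    for target_dir in target_dirs:
        destination = os.path.join(target_dir, name)
        shutil.copy2(source, destination)  # overwrites any existing file
        copied.append(destination)
    return copied
```

For example, `replace_osd("tessdata", ["tessdata_best", "tessdata_fast"])` mirrors the copy described above, assuming the three repositories are checked out side by side.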

ShreeDevi
