Training a new font with tesstrain.sh failed at phase M


Kiều Vương

Sep 27, 2016, 12:32:02 AM
to tesseract-ocr
- I need to train a new font on Ubuntu 14.04 with tesseract 3.0.5 and leptonica-1.73.
- I prepared a .tif file of the font and a font_properties text file, and followed the instructions to train with tesstrain.sh.
- My command is: ./tesstrain.sh --lang eng --langdata_dir /home/ai02/github/langdata --tessdata_dir /usr/local/share/tessdata --fonts_dir /home/ai02/fonts_anhha --fontlist 'Palatino Sans Arabic Bold'
- The process always stops at "===Phase M : Clustering microfeatures (mfTraining) ===" with the error: ERROR: /tmp/tmp.d9EjXiPeV5/eng/inttemp does not exist or is not readable. The details are in the attached tesstrain.log file.
- I don't know why the error happens. I tried modifying some of the source code to trace it but learned nothing. I also changed the permissions of the tmp folder, but that made no difference. Has anyone run into the same problem? Please give me some advice!

Thank you so much!


 
FayezFont.ttf
font_properties
tesstrain.log

ShreeDevi Kumar

Sep 27, 2016, 1:50:14 AM
to tesser...@googlegroups.com

Are you trying to train for English language with an Arabic font?


--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/a84e9c6a-8c41-475d-ad4b-3e97da5ba9cf%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

ada...@turningcloud.com

Feb 12, 2018, 6:16:23 AM
to tesseract-ocr

Hello Shree,
I am running into this issue too; the error is shown below:

/home/adarsh/tesseract/font_properties does not exist or is not readable
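
One way to narrow down a "does not exist or is not readable" message is to test the reported path directly before re-running the training. The following is only a minimal sketch, assuming a POSIX shell; `check_readable` is a hypothetical helper written for illustration and is not part of tesstrain.sh or the tesseract training tools.

```shell
#!/bin/sh
# Hypothetical helper (not part of tesstrain.sh): report whether a path
# exists and is readable by the current user, using wording similar to
# the errors quoted in this thread.
check_readable() {
    if [ -r "$1" ]; then
        echo "OK: $1"
    else
        echo "ERROR: $1 does not exist or is not readable" >&2
        return 1
    fi
}
```

For example, `check_readable /home/adarsh/tesseract/font_properties` would show whether the file is visible to the user running the training. If the file is readable and the error persists, the problem is more likely in an earlier training phase than in file permissions.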


