ERROR: exp0.box does not exist or is not readable

Fanatico

Apr 6, 2018, 7:21:11 PM
to tesseract-ocr
I'm trying to run the training from the 4.0 tutorial, but I'm getting an error. Can someone help with this?

Platform: Mac OS X 10.13.3
Tesseract: 4.0.0-beta.1
leptonica: 1.75.3
libjpeg 9c : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11


Code used

../../tesseract/training/tesstrain.sh \
  --fonts_dir /Library/Fonts \
  --lang eng --linedata_only \
  --noextract_font_properties \
  --exposures "0"    \
  --langdata_dir ../../langdata \
  --tessdata_dir /usr/local/Cellar/tesseract/HEAD-f8e26ee/share/tessdata \
  --fontlist "Verdana" \
  --output_dir .~/tesstutorial/ara

Result

=== Starting training for language 'eng'
[Fri Apr 6 20:19:15 -03 2018] /usr/local/bin/text2image --fonts_dir=/Library/Fonts --font=Verdana --outputbase=/var/folders/xl/gqcd7ljn0k7d3r_3j9dy7x340000gn/T/font_tmp.XXXXXXXXXX.aU9oTb7N/sample_text.txt --text=/var/folders/xl/gqcd7ljn0k7d3r_3j9dy7x340000gn/T/font_tmp.XXXXXXXXXX.aU9oTb7N/sample_text.txt --fontconfig_tmpdir=/var/folders/xl/gqcd7ljn0k7d3r_3j9dy7x340000gn/T/font_tmp.XXXXXXXXXX.aU9oTb7N

=== Phase I: Generating training images ===
Rendering using Verdana
[Fri Apr 6 20:19:17 -03 2018] /usr/local/bin/text2image --fontconfig_tmpdir=/var/folders/xl/gqcd7ljn0k7d3r_3j9dy7x340000gn/T/font_tmp.XXXXXXXXXX.aU9oTb7N --fonts_dir=/Library/Fonts --strip_unrenderable_words --leading=32 --char_spacing=0.0 --exposure=0 --outputbase=/var/folders/xl/gqcd7ljn0k7d3r_3j9dy7x340000gn/T/tmp.OaBuo1g2/eng/eng.Verdana.exp0 --max_pages=3 --font=Verdana --text=../../langdata/eng/eng.training_text
ERROR: /var/folders/xl/gqcd7ljn0k7d3r_3j9dy7x340000gn/T/tmp.OaBuo1g2/eng/eng.Verdana.exp0.box does not exist or is not readable
ERROR: /var/folders/xl/gqcd7ljn0k7d3r_3j9dy7x340000gn/T/tmp.OaBuo1g2/eng/eng.Verdana.exp0.box does not exist or is not readable

Observations

I can find the font if I use:

text2image --list_available_fonts --fonts_dir=/Library/Fonts


I tested some other fonts.

Thanks for the time and reply!

ShreeDevi Kumar

Apr 6, 2018, 10:28:06 PM
to tesser...@googlegroups.com
Is your langdata actually in the directory you pass as --langdata_dir ../../langdata?

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/cbe9828e-690f-4bc4-8592-d195370d4857%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Fanatico

Apr 7, 2018, 1:30:44 AM
to tesseract-ocr
Yes, the location is correct. I also tried giving the full path to the folder and got the same error.

ShreeDevi Kumar

Apr 7, 2018, 3:35:36 AM
to tesser...@googlegroups.com
Look in your tmp directory, in the subfolders referred to in the console output.

Check the log file and the other files there.
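
A minimal sketch of that check, for readers following along. The function name inspect_training_dir is hypothetical, and no particular log file name is assumed: it simply lists every file under the directory you pass in and prints the tail of anything ending in .log. Pass the temporary directory shown in your own console output.

```shell
# Hypothetical helper: show what the training run left in its temporary directory.
# Usage: inspect_training_dir <directory printed in the console output>
inspect_training_dir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  # List every file with its size, so a missing or empty .box file stands out.
  find "$dir" -type f -exec ls -l {} +
  # Print the last lines of any log files found there (names are not assumed).
  find "$dir" -type f -name '*.log' -exec tail -n 20 {} +
}
```

If the listing shows no .box or .tif files at all, the rendering step itself failed, and the log output is the place to look for the reason.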

Fanatico

Apr 7, 2018, 3:43:25 AM
to tesseract-ocr
Thanks for the reply, but I just fixed this. The problem was that the PANGOCAIRO_BACKEND variable was empty on Mac OS X, so I needed to set it before running the script. Something like this:

PANGOCAIRO_BACKEND=fc \
../../tesseract/training/tesstrain.sh \
  --fonts_dir /Library/Fonts \
  --lang eng --linedata_only \
  --noextract_font_properties \
  --exposures "0"    \
  --langdata_dir ../../langdata \
  --tessdata_dir /usr/local/Cellar/tesseract/HEAD-f8e26ee/share/tessdata \
  --fontlist "Verdana" \
  --output_dir .~/tesstutorial/eng

If anyone needs more details, please look here: https://github.com/tesseract-ocr/tesseract/issues/736
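
As a variation on the fix above, here is a minimal sketch that sets the variable once for the whole shell session instead of prefixing each command. It assumes a POSIX-compatible shell; the value fc is the one used in the command above, and the echo is only there to confirm what the training script will see.

```shell
# Use the existing value if one is set; otherwise fall back to "fc",
# the value used in the fix above.
: "${PANGOCAIRO_BACKEND:=fc}"
export PANGOCAIRO_BACKEND

# Confirm the value before running tesstrain.sh in this same shell.
echo "PANGOCAIRO_BACKEND=${PANGOCAIRO_BACKEND}"
```

Run tesstrain.sh from the same shell afterwards so that it inherits the exported variable.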