Hello,
I attempted to run the following command:
src/training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only --noextract_font_properties --langdata_dir ~/tesstutorial/langdata --tessdata_dir ~/tesstutorial/tesseract/tessdata --output_dir ~/tesstutorial/engtrain
(copied from the TessTutorial section of the document Training Tesseract 4.00).
Everything seems to be going fine until it (spuriously?) generates an error message in the log file:
Rendered page 3355 to file /tmp/eng-2019-09-14.GmB/eng.Arial_Italic.exp0.tif
Rendered page 3370 to file /tmp/eng-2019-09-14.GmB/eng.Arial_Bold_Italic.exp0.tif
ERROR: Program text2image failed. Abort.
Rendered page 3367 to file /tmp/eng-2019-09-14.GmB/eng.Arial.exp0.tif
Rendered page 3356 to file /tmp/eng-2019-09-14.GmB/eng.Arial_Italic.exp0.tif
...
After this, training continues and then ends without copying anything out of the /tmp directory. In my case, it generated 7 of the 8 box files (eng.Courier_New.exp0.box is missing), as shown by a listing of /tmp/eng-2019-09-14.GmB:
dmaung@Rhinegeist1:~/Tesseract-git/tesseract$ ls -1 /tmp/eng-2019-09-14.GmB/
eng.Arial_Bold.exp0.box
eng.Arial_Bold.exp0.tif
eng.Arial_Bold_Italic.exp0.box
eng.Arial_Bold_Italic.exp0.tif
eng.Arial.exp0.box
eng.Arial.exp0.tif
eng.Arial_Italic.exp0.box
eng.Arial_Italic.exp0.tif
eng.Courier_New_Bold.exp0.box
eng.Courier_New_Bold.exp0.tif
eng.Courier_New_Bold_Italic.exp0.box
eng.Courier_New_Bold_Italic.exp0.tif
eng.Courier_New.exp0.tif
eng.Courier_New_Italic.exp0.box
eng.Courier_New_Italic.exp0.tif
tesstrain.log
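To confirm which image is missing its box file without comparing the listing by eye, a small shell loop can do the check. This is only a sketch: the function name missing_boxes is made up for illustration, and it assumes every eng.*.tif should have a .box file with the same base name.

```shell
# Hypothetical helper: print every .tif in a directory that has no matching .box file.
missing_boxes() {
  dir="$1"
  for tif in "$dir"/*.tif; do
    [ -e "$tif" ] || continue          # glob matched nothing, skip the literal pattern
    box="${tif%.tif}.box"              # same base name with a .box extension
    [ -e "$box" ] || printf '%s\n' "$tif"
  done
}

# Example call (directory taken from the listing above):
# missing_boxes /tmp/eng-2019-09-14.GmB
```

Run against the directory listed above, it should report only eng.Courier_New.exp0.tif, which suggests the text2image run for the Courier New font is the one that failed.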
Can anyone suggest how to debug what is causing text2image to fail or how to get around it?
David