ERROR: Could not find training text file

Ava Nimaee

Jul 31, 2017, 6:54:57 AM
to tesseract-ocr
Hi, sorry, I used this command:
training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only \
  --noextract_font_properties --langdata_dir langdata \
  --tessdata_dir tessdata \
  --fontlist "Times New Roman," --output_dir engtrain
Before that, I created the box file, the tif, and the unicharset output.
I cloned langdata for Tesseract v4.0,
but I get this error:
 === Phase I: Generating training images ===
ERROR: Could not find training text file langdata/eng/eng.training_text
I can't solve it, and I don't know where I should put training_text.txt. It is the text file that I want to train on.
Thanks for your attention.

ShreeDevi Kumar

Jul 31, 2017, 7:40:14 AM
to tesser...@googlegroups.com
Add a line similar to the following to your training command, pointing to where you have your training text:

  --training_text ../langdata/eng/eng.training_text \
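For illustration only, here is a sketch of how the original command might look with that line added, guarded by a check that the file exists. The paths are the placeholders used in this thread, and `have_training_text` is a hypothetical helper, not part of tesstrain.sh; adjust both to your own checkout.

```shell
#!/bin/sh
# Placeholder paths taken from the thread; change them to where your files live.
LANGDATA_DIR="langdata"
TRAINING_TEXT="$LANGDATA_DIR/eng/eng.training_text"

# Hypothetical helper: succeeds only if the file exists and is not empty.
have_training_text() {
  [ -s "$1" ]
}

# Run the training script only when both the text file and the script are present.
if have_training_text "$TRAINING_TEXT" && [ -x training/tesstrain.sh ]; then
  training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only \
    --noextract_font_properties --langdata_dir "$LANGDATA_DIR" \
    --tessdata_dir tessdata \
    --training_text "$TRAINING_TEXT" \
    --fontlist "Times New Roman," --output_dir engtrain
else
  echo "Skipping: $TRAINING_TEXT or training/tesstrain.sh not found" >&2
fi
```

The check runs relative to the directory you start the script from, which is also how the relative paths in the command are resolved.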

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/a141d688-bc59-4485-b7bc-66ac650ebfd8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ava Nimaee

Aug 4, 2017, 2:49:37 AM
to tesseract-ocr
Thanks a lot.

ada...@turningcloud.com

Jan 30, 2018, 1:44:42 AM
to tesseract-ocr
Do we need to keep the langdata folder in a specific location, or can we give its path in this command?
This is the line you suggested:
--training_text ../langdata/eng/eng.training_text \
This is what I typed:

$ training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only --noextract_font_properties --langdata_dir ../langdata --tessdata_dir ./tessdata --output_dir ~/tesstutorial/engtrain --training_text ../home/adarsh/tes1/tesseract/langdata/eng/eng.training_text \

ShreeDevi Kumar

Jan 30, 2018, 2:57:11 AM
to tesser...@googlegroups.com
You need to give the path based on where you have the files.

E.g., change the langdata dir from ../langdata to ../home/adarsh/tes1/tesseract/langdata

Make sure it has the other required files.
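
As a rough sketch of that check, the snippet below prints the absolute path a directory argument resolves to from the current working directory and verifies that the files you name are inside it. `resolve_dir` and `check_langdata` are hypothetical helpers written for this example, and the list of files to check is left to the caller, since the thread only names eng/eng.training_text.

```shell
#!/bin/sh
# Hypothetical helper: print the absolute path that a directory resolves to
# from the current working directory; fails if the directory does not exist.
resolve_dir() {
  (cd "$1" 2>/dev/null && pwd)
}

# Hypothetical helper: check that a langdata directory contains the given files.
# Usage: check_langdata DIR RELATIVE_FILE...
check_langdata() {
  dir=$1
  shift
  abs=$(resolve_dir "$dir") || { echo "no such directory: $dir" >&2; return 1; }
  for f in "$@"; do
    [ -f "$abs/$f" ] || { echo "missing: $abs/$f" >&2; return 1; }
  done
  # Print the resolved path so it can be passed to --langdata_dir.
  echo "$abs"
}
```

For example, `check_langdata ../langdata eng/eng.training_text` either prints the resolved directory or reports what is missing, which shows quickly whether a relative path points where you expect.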