Error when running "make training" command

Duy Hoàng

May 29, 2024, 9:45:49 AM
to tesseract-ocr
I'm creating a training file on Windows, following the instructions here: https://github.com/tesseract-ocr/tesstrain/

I'm using Tesseract OCR version 5.3.4.
Can someone help me with this case?

$ make training
You are using make version: 4.4.1
unicharset_extractor --output_unicharset "data/korletter/unicharset" --norm_mode 2 "data/korletter/all-gt"
Extracting unicharset from plain text file data/korletter/all-gt
Wrote unicharset file data/korletter/unicharset
python shuffle.py 0 "data/korletter/all-lstmf"
python generate_eval_train.py data/korletter/all-lstmf 0.90
dos2unix "data/korletter/korletter.numbers"
dos2unix: data/korletter/korletter.numbers: No such file or directory
dos2unix: Skipping data/korletter/korletter.numbers, not a regular file.
make: [Makefile:290: data/korletter/korletter.traineddata] Error 2 (ignored)
dos2unix "data/korletter/korletter.punc"
dos2unix: data/korletter/korletter.punc: No such file or directory
dos2unix: Skipping data/korletter/korletter.punc, not a regular file.
make: [Makefile:291: data/korletter/korletter.traineddata] Error 2 (ignored)
dos2unix "data/korletter/korletter.wordlist"
dos2unix: data/korletter/korletter.wordlist: No such file or directory
dos2unix: Skipping data/korletter/korletter.wordlist, not a regular file.
make: [Makefile:292: data/korletter/korletter.traineddata] Error 2 (ignored)
dos2unix "data/langdata/korletter/korletter.config"
dos2unix: data/langdata/korletter/korletter.config: No such file or directory
dos2unix: Skipping data/langdata/korletter/korletter.config, not a regular file.
make: [Makefile:293: data/korletter/korletter.traineddata] Error 2 (ignored)
combine_lang_model \
  --input_unicharset data/korletter/unicharset \
  --script_dir data/langdata \
  --numbers data/korletter/korletter.numbers \
  --puncs data/korletter/korletter.punc \
  --words data/korletter/korletter.wordlist \
  --output_dir data \
   \
  --lang korletter
Failed to read data from: data/korletter/korletter.wordlist
Failed to read data from: data/korletter/korletter.punc
Failed to read data from: data/korletter/korletter.numbers

Loaded unicharset of size 4 from file data/korletter/unicharset
Setting unichar properties
Setting script properties
Config file is optional, continuing...
Failed to read data from: data/langdata/korletter/korletter.config
Null char=2
Created data/korletter/korletter.traineddata
lstmtraining \
  --debug_interval 0 \
  --traineddata data/korletter/korletter.traineddata \
  --learning_rate 0.002 \
  --net_spec "[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx192 O1c4]" \
  --model_output data/korletter/checkpoints/korletter \
  --train_listfile data/korletter/list.train \
  --eval_listfile data/korletter/list.eval \
  --max_iterations 10000 \
  --target_error_rate 0.01 \
2>&1 | tee -a data/korletter/training.log
Failed to load list of training filenames from data/korletter/list.train

lstmtraining \
--stop_training \
--continue_from data/korletter/checkpoints/korletter_checkpoint \
--traineddata data/korletter/korletter.traineddata \
--model_output data/korletter.traineddata
Failed to read continue from: data/korletter/checkpoints/korletter_checkpoint
make: *** [Makefile:347: data/korletter.traineddata] Error 1
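
In the log above, the dos2unix errors are marked "(ignored)" and combine_lang_model reports the config file as optional, so the first failure that is not ignored is lstmtraining being unable to load data/korletter/list.train. A minimal sketch for checking the intermediate files before rerunning, assuming only the paths that appear in the log (the function name check_training_inputs is hypothetical and is not part of tesstrain):

```python
from pathlib import Path


def check_training_inputs(model_name: str, data_dir: str = "data") -> dict[str, str]:
    """Report the state of each intermediate file named in the log.

    The file names are taken from the log output; this helper is an
    illustration and is not provided by tesstrain.
    """
    model_dir = Path(data_dir) / model_name
    report = {}
    for name in ("all-gt", "all-lstmf", "list.train", "list.eval"):
        path = model_dir / name
        if not path.is_file():
            report[name] = "missing"
        elif path.stat().st_size == 0:
            report[name] = "empty"
        else:
            # Count non-blank lines; list files hold one .lstmf path per line.
            with path.open(encoding="utf-8", errors="replace") as handle:
                lines = sum(1 for line in handle if line.strip())
            report[name] = f"{lines} non-blank line(s)"
    return report


if __name__ == "__main__":
    for name, status in check_training_inputs("korletter").items():
        print(f"{name}: {status}")
```

Run it from the tesstrain directory. A "missing" or "empty" result for all-lstmf or list.train would suggest that no training samples were generated; the reported unicharset size of 4 may point the same way.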

Zdenko Podobny

May 29, 2024, 1:27:15 PM
to tesser...@googlegroups.com
So:
  1. If you have a problem, use the example data (ocrd-testset.zip) or provide your own data set so the problem can be reproduced.
  2. Make sure you are using the latest version of tesstrain.
  3. 'make training' does not produce the output you presented. If you want help, provide the real steps for reproducing the problem.

Zdenko


On Wed, May 29, 2024 at 15:45, Duy Hoàng <duyngu...@gmail.com> wrote: