Failed to load list of training filenames from data/foo/list.train


Omar Samir

Dec 30, 2023, 12:24:19 PM
to tesseract-ocr
I was trying to train Tesseract-OCR on the ocrd-testset.zip from the README, and I get the error shown in the subject.

Omar Samir

Dec 30, 2023, 12:24:20 PM
to tesseract-ocr
I am trying to train Tesseract using tesstrain on the ocrd-testset.zip from the README, and I get the error shown in the subject.
I have downloaded Tesseract using these two videos:

https://youtu.be/pe80OEJkS7U?si=AtfumDuDrMy8sXuO

I cloned the tesstrain repository, ran make tesseract-langdata, created a data folder, created foo-grand-truth and unzipped ocrd-testset.zip into it, and then ran make training.
This is the whole error:
   File 'data/foo.traineddata' does not exist.
     File 'data/foo/checkpoints/foo_checkpoint' does not exist.
       File 'unicharset' does not exist.
make: Warning: File 'data/foo/unicharset' has modification time 1125 s in the future
      Must remake target 'unicharset'.
      Successfully remade target file 'unicharset'.
       File 'lists' does not exist.
      Must remake target 'lists'.
      Successfully remade target file 'lists'.
    Must remake target 'data/foo/checkpoints/foo_checkpoint'.
lstmtraining \
  --debug_interval 0 \
  --traineddata data/foo/foo.traineddata \
  --learning_rate 0.002 \
  --net_spec "[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx192 O1c`head -n1 data/foo/unicharset`]" \
  --model_output data/foo/checkpoints/foo \
  --train_listfile data/foo/list.train \
  --eval_listfile data/foo/list.eval \
  --max_iterations 10000 \
  --target_error_rate 0.01
Failed to load list of training filenames from data/foo/list.train
make: *** [Makefile:332: data/foo/checkpoints/foo_checkpoint] Error 1

Zdenko Podobny

Jan 1, 2024, 12:24:19 PM
to tesser...@googlegroups.com
Follow https://github.com/tesseract-ocr/tesstrain/blob/main/README.md
Tesseract OCR 3.05.02 was released 6 years ago...
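Before re-running make training, it may help to check what the training step will actually find. The sketch below is only a diagnostic, not part of tesstrain: the function name check_training_inputs is hypothetical, and it assumes the layout described in the README, where the line images and their .gt.txt transcriptions sit in data/MODEL_NAME-ground-truth and the generated list ends up in data/MODEL_NAME/list.train.

```shell
#!/bin/sh
# Hypothetical helper (not part of tesstrain): report what the training
# step would find for a given model name.
# Usage: check_training_inputs foo        (looks under ./data)
#        check_training_inputs foo /path/to/data
check_training_inputs() {
    model="$1"
    data_dir="${2:-data}"
    # Assumed layout, taken from the tesstrain README.
    gt_dir="$data_dir/$model-ground-truth"
    list_file="$data_dir/$model/list.train"

    # The directory name must match MODEL_NAME-ground-truth exactly.
    if [ ! -d "$gt_dir" ]; then
        echo "missing ground-truth directory: $gt_dir"
        return 1
    fi

    # Count transcription files anywhere below the ground-truth directory.
    gt_count=$(find "$gt_dir" -name '*.gt.txt' | wc -l | tr -d ' ')
    if [ "$gt_count" -eq 0 ]; then
        echo "no .gt.txt files in $gt_dir"
        return 1
    fi

    # -s is true only for a file that exists and is not empty.
    if [ ! -s "$list_file" ]; then
        echo "empty or missing list file: $list_file"
        return 1
    fi

    echo "ok: $gt_count transcriptions, list file present"
}
```

If the first check fails, the directory name is the first thing to compare against the README, since a misspelled ground-truth folder would leave the lists with nothing to put in them.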

Zdenko


On Sat, Dec 30, 2023 at 18:24, Omar Samir <omare...@gmail.com> wrote:
I was trying to train Tesseract-OCR on the ocrd-testset.zip in the README, and I get this error above in the subject 
