Caching in TrainLineRecognizer?

Jens Weibler

Mar 5, 2017, 1:32:36 AM
to tesseract-ocr
Hi,

I'm new to Tesseract and wonder why the LSTM dataset creation in the training process has to write the file again and again in TrainLineRecognizer. I have seen 200 MB/s of disk I/O while creating the training dataset.
As far as I can see, for the training case it would be sufficient to load the file once and write it once at the end. The same applies to the box and tif files, but these are only read, not written...
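
To make the idea concrete, here is a minimal Python sketch of "collect in memory, write once at the end". It is not Tesseract's actual code: `BufferedDatasetWriter`, `build_dataset`, and the pickle-based file format are hypothetical names and choices used only for illustration.

```python
import os
import pickle
import tempfile


class BufferedDatasetWriter:
    """Hypothetical helper: collect samples in memory, write the file once."""

    def __init__(self, path):
        self.path = path
        self.samples = []
        self.writes = 0  # counts how often the file is actually written

    def add(self, sample):
        # Only memory is touched here; there is no disk I/O per sample.
        self.samples.append(sample)

    def flush(self):
        # Serialize the whole dataset in a single write.
        with open(self.path, "wb") as f:
            pickle.dump(self.samples, f)
        self.writes += 1


def build_dataset(path, lines):
    """Add every line to the buffer, then write the output file exactly once."""
    writer = BufferedDatasetWriter(path)
    for line in lines:
        writer.add(line)
    writer.flush()
    return writer


if __name__ == "__main__":
    out = os.path.join(tempfile.mkdtemp(), "dataset.bin")
    w = build_dataset(out, ["line 1", "line 2", "line 3"])
    print("file writes:", w.writes)
```

The trade-off is memory: this only works if the whole dataset fits in RAM, which I assume is the case for a single training document.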


Thanks,
Jens Weibler