Hi,
I wrote this small script to speed up the startup of ocrd-train training.
It generates the boxes for all the images provided on the command line (it works only for single line images).
It is a straightforward conversion of generate_line_box.py from ocrd-train. I have used it only once, but it seems to work fine.
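To illustrate what such a box file contains for a single-line image, here is a rough sketch. It is not the code of generate_line_box.py: it assumes that every character is given the bounding box of the whole line image and that a tab entry marks the end of the line, and it ignores combining characters. The function name line_box_entries and its parameters are made up for this example.

```python
def line_box_entries(text, width, height):
    """Return box-file lines for one text line on a width x height image.

    Simplified sketch (assumption): each character gets the full-line box
    "0 0 width height" on page 0; combining characters are not merged.
    """
    entries = []
    for char in text.strip():
        # Every character is assigned the whole line image as its box.
        entries.append("%s 0 0 %d %d 0" % (char, width, height))
    # A tab entry just outside the image marks the end of the text line.
    entries.append("\t %d %d %d %d 0" % (width, height, width + 1, height + 1))
    return entries
```

The image size would come from the .tif itself and the text from the matching ground-truth file; writing the returned lines to a .box file next to each image, all within one Python process, is what avoids the per-image process startup.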
Currently, with OCR-D, generating the boxes and lstmf files is very slow because a new process is started for each image.
I run this script before calling the Makefile.
I do the "shell expansion" in Python, rather than in the shell, so that it can handle a very long list of files.
So you need to call it like this:
python generate_all_line_boxes.py -i 'data/train/*.tif'
with single quotes to prevent shell expansion.
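For reference, expanding the pattern inside Python can be sketched as below. This is only an illustration using the standard argparse and glob modules, not the actual script; the helper names expand_patterns and parse_args and the long option --images are assumptions.

```python
import argparse
import glob


def expand_patterns(patterns):
    """Expand glob patterns in Python and return a sorted list without duplicates."""
    files = set()
    for pattern in patterns:
        # glob.glob returns an empty list when nothing matches the pattern.
        files.update(glob.glob(pattern))
    return sorted(files)


def parse_args(argv=None):
    """Parse the command line; -i takes one or more quoted glob patterns."""
    parser = argparse.ArgumentParser(description="Expand quoted glob patterns in Python")
    parser.add_argument("-i", "--images", nargs="+", required=True,
                        help="quoted glob pattern(s), e.g. 'data/train/*.tif'")
    return parser.parse_args(argv)
```

Because the pattern reaches Python as a single quoted argument, the shell never has to build a command line with thousands of file names, which is where the "argument list too long" error would otherwise come from. A caller would do something like expand_patterns(parse_args().images) and loop over the result.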
BTW, it would be nice to have the same thing for the lstmf files.
Bye
Lorenzo