It seems that tesstrain.sh creates a training file containing most of the bigrams and then generates tif/box pairs from that training file, but never uses the tif/box pairs. Is that intended?
(Also, the comment for the creation of the training file says "Take only the ngrams whose combined weight accounts for 95% of all the bigrams", but the code seems to use 99%.)
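
To make the difference concrete, here is a minimal sketch of how I read that cutoff. It is not the code from tesstrain.sh: the function name, the dict input, and the sample weights are all made up for illustration, and I am assuming the frequency data is simply a set of ngram/weight pairs.

```python
def select_top_ngrams(freqs, percent):
    """Pick ngrams, heaviest first, while the running weight stays
    within `percent` percent of the total weight.

    Hypothetical helper for illustration only; not taken from tesstrain.sh.
    """
    threshold = sum(freqs.values()) * percent / 100
    selected = []
    running = 0
    # Walk the ngrams from highest to lowest weight.
    for ngram, weight in sorted(freqs.items(), key=lambda kv: kv[1], reverse=True):
        running += weight
        if running > threshold:
            break
        selected.append(ngram)
    return selected


# Made-up weights that sum to 100, so the percentages are easy to follow.
sample = {"th": 50, "he": 30, "in": 15, "er": 4, "zq": 1}
at_95 = select_top_ngrams(sample, 95)
at_99 = select_top_ngrams(sample, 99)
print(at_95)
print(at_99)
```

With these made-up weights, the 95% cutoff keeps three bigrams and the 99% cutoff keeps four, so the comment and the code would produce different training files.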
Thanks,
Eric.