I've already configured Tesseract not to use dictionaries when processing non-word images, but what is the best practice for training a non-word traineddata file with tesstrain?
Currently, I'm training with tesstrain using tessdata_best and my corrected ground truth.
In my situation, is it better not to train with tessdata_best? Does tessdata_best contain word dictionaries that will needlessly bloat my traineddata file?
Thank you for your time and guidance.
Respectfully,
Gary