Hello,
Using Tesseract, I am trying to extract hexadecimal numbers (10 characters long) from video screenshots. So far, very few of the numbers are recognized correctly.
The screenshots (1280x720 pixels) may or may not have text other than the hexadecimal number. Really, it doesn't matter if that text is output or not. The hexadecimal number can be located anywhere in the image.
Targeted text is always:
Hexadecimal characters (0-9, uppercase A-F)
10 characters long
Same font (Open Sans Bold)
Same size (x-height of 11 pixels, though the characters are always uppercase)
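To make the target concrete, here is a minimal sketch of how the constraints above could be used to post-filter whatever text the OCR returns. It is plain Python, not part of Tesseract, and it assumes the recognized text is already available as a string. The function name `find_hex_ids` and the sample text are made up for illustration.

```python
import re

# Exactly 10 uppercase hex characters, not embedded in a longer
# alphanumeric run (so a 16-character hex string is not matched).
HEX_ID = re.compile(r"(?<![0-9A-Za-z])[0-9A-F]{10}(?![0-9A-Za-z])")


def find_hex_ids(ocr_text):
    """Return every standalone 10-character uppercase hex token in ocr_text."""
    return HEX_ID.findall(ocr_text)


# Hypothetical OCR output: other on-screen text plus one target number.
sample = "LIVE 12:45\nID 3FA9C01B7E\nsome caption 0123456789ABCDEF"
print(find_hex_ids(sample))  # ['3FA9C01B7E']
```

This only discards unrelated text after recognition. It does not help when the characters themselves are misread, which is the part I am stuck on.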
This is what I've tried:
tesseract list.txt out -c tessedit_char_whitelist=0123456789ABCDEF
I have also tried disabling the dictionaries.
Is there any way training could help me locate that text more reliably? Basically, can I force Tesseract to look for only one size and one font?
Thanks