Hello,
I'm using Tesseract 4.1.0.0 to OCR a text field on the target that contains codes following
a fixed pattern (implemented as a pattern file, in Tesseract terms):
P\n\n\n\n
C\n\n\n\n
B\n\n\n\n
U\n\n\n\n
In practice, each code starts with one letter (P, C, B or U) followed by 4 hex digits.
The total length is always exactly 5 characters.
So, at least as I intend this pattern file to work, correct output would be, for example:
P0123, P2EFD, C12EF, B2BCD and so on.
Running the OCR script thousands of times, I see that the vast majority of the output is as
expected, but I also get some results like PPB, PFF3, CC3 and so on.
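
For what it's worth, I can at least flag these bad results after the fact. Below is a minimal
Python sketch of such a check. It is not part of Tesseract, the regex is just my reading of the
expected format, and the names CODE_RE and is_valid_code are made up for illustration:

```python
import re

# Assumed format: one of P, C, B, U followed by exactly 4 uppercase hex digits.
CODE_RE = re.compile(r"[PCBU][0-9A-F]{4}")

def is_valid_code(text: str) -> bool:
    """Return True if an OCR result is a well-formed 5-character code."""
    # strip() drops the trailing newline/whitespace OCR output often carries.
    return CODE_RE.fullmatch(text.strip()) is not None

if __name__ == "__main__":
    # Sample results taken from the examples in this post.
    for result in ["P0123", "C12EF", "PPB", "PFF3", "CC3"]:
        print(result, is_valid_code(result))
```

This only detects malformed results so they can be rejected or retried; it does not make
Tesseract itself follow the pattern, which is what I'm really after.
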
Is there a way to enforce stricter adherence to the pattern? My current setup is:
user_patterns_file=C:\Util\Code_OCR.Pattern
tessedit_char_whitelist=PCBU0123456789ABCDEF
tessedit_char_blacklist=abcdefGgHhIiLlMmNnOopQqRrSsTtuVvZzJjYyKkWw-!|
load_system_dawg=F
load_freq_dawg=F
Thanks in advance.