Status: New
Owner: ----
New issue 1500 by leopold....@gmail.com: Tesseract OCR force pattern
https://code.google.com/p/tesseract-ocr/issues/detail?id=1500
What steps will reproduce the problem?
1. Follow the bazaar tutorial
2. Test with a simple image and the pattern TEST\A\A\d\d\d
3. The result is not filtered
What is the expected output? What do you see instead?
Expected: TESTAB123
Actual:
TESTAB123
TESTABC12
TESTA1234
TEST12345
TESTABCD1
What version of the product are you using? On what operating system?
Tesseract 3
Windows 8
I want to read a specific character sequence with Tesseract: the word
"TEST" followed by 2 letters and 3 digits.
I tried the pattern matching described in the bazaar tutorial, with the
pattern
TEST\A\A\d\d\d
but the OCR still recognizes other words that do not match it.
I also tried the "tessedit_char_whitelist" parameter, but it does not let
me choose which characters are allowed at which position.
I run the command: tesseract image.jpg result -l eng bazaar
and I get no error message, just:
"Tesseract Open Source OCR Engine v3.01 with Leptonica"
The result: TESTAB123 TESTABC12 TESTA1234 TEST12345 TESTABCD1
This is wrong: I only wanted to capture the sequence "TESTAB123".
Can somebody tell me why the regular expression in my user-patterns file
has no effect? For the configuration, I have STRICTLY followed the bazaar
tutorial.
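
For reference, the filtering I expected can be sketched as a
post-processing step on the OCR output. This is only a workaround sketch,
not a fix for the user-patterns behaviour: it assumes that \A stands for
one upper-case letter and \d for one digit, and the names TOKEN_RE and
filter_tokens are hypothetical, not part of Tesseract.

```python
import re

# Assumed meaning of the pattern TEST\A\A\d\d\d: the literal "TEST",
# two upper-case letters, then three digits.
TOKEN_RE = re.compile(r"TEST[A-Z]{2}\d{3}")


def filter_tokens(ocr_text):
    """Return only the whitespace-separated tokens that match in full."""
    return [tok for tok in ocr_text.split() if TOKEN_RE.fullmatch(tok)]


if __name__ == "__main__":
    # Output text reported above; in practice it would be read from
    # the result file written by the tesseract command.
    text = "TESTAB123 TESTABC12 TESTA1234 TEST12345 TESTABCD1"
    print(filter_tokens(text))  # ['TESTAB123']
```

Using fullmatch rather than search matters here: a token such as
TESTABC12 must be rejected as a whole, not accepted because part of it
looks similar to the pattern.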
Attachments:
image.jpg 31.0 KB