I am trying to use Tesseract 3.03 to OCR scanned pages.
In many cases a single scan run contains several jobs, which are separated by feeding a special separator page between them.
This page contains only a line of 12 "T" characters at the top left of the page (and a second, upside-down line at the bottom right).
I have tried a lot, but Tesseract seems to ignore this text completely, even though the scan looks great: the OCR output for that page is completely empty. The rest of the OCRed text looks good.
The idea is not mine, but I have to use this kind of separation.
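For context, this is roughly how the split is supposed to work once the separator text is recognized. It is only a minimal Python sketch, assuming the OCR text of each page is already available as a string. The names `is_separator` and `split_jobs` and the threshold of 10 characters are illustrative choices of mine, not anything provided by Tesseract.

```python
import re

# Hypothetical threshold: the separator page carries 12 "T" characters;
# accept 10 or more to tolerate a few dropped characters.
MIN_T_RUN = 10


def is_separator(page_text, min_run=MIN_T_RUN):
    """Return True if any line of the page consists only of 'T's (ignoring spaces)."""
    for line in page_text.splitlines():
        compact = re.sub(r"\s+", "", line)
        if len(compact) >= min_run and set(compact) == {"T"}:
            return True
    return False


def split_jobs(pages):
    """Group per-page OCR text into jobs, dropping the separator pages."""
    jobs, current = [], []
    for text in pages:
        if is_separator(text):
            # A separator closes the current job (if it has any pages).
            if current:
                jobs.append(current)
            current = []
        else:
            current.append(text)
    if current:
        jobs.append(current)
    return jobs
```

With an empty OCR result for the separator page, `is_separator` never fires and all pages end up in one job, which is exactly my problem.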
Is there something I can do to improve recognition of this special text?
Nicolas Nickisch