Hello,
I'm using Tesseract v3.02.02.
I'm unable to get it to consistently recognize a number that contains a decimal point. Tesseract recognizes the digits, and it recognizes the leading minus sign when there is one.
However, it always drops the decimal point. Has anyone else run into this and found a solution? I'm doing all of this from a C++ executable on Ubuntu.
Here are the things I tried that did not help:
1. I normally use PNG files. I've tried JPG and TIFF, but the results are the same.
2. I tried adjusting the contrast: I made the background black and the numbers bright yellow.
3. I tried setting tessedit_char_whitelist to ".-0123456789" explicitly.
4. I've also used TesseractRect to OCR only the numbers I'm interested in.
5. I tried using GetUNLVText instead of GetUTF8Text. GetUNLVText was worse and just returned garbage.
6. The command-line tesseract command also gives me the same results; I even tried using the 'digits' configuration.
regards,m