Hi,
I have two fairly similar images that I want to OCR.

I think both are of pretty good quality. To OCR the second one, I'm using this command:
tesseract image_1758836841_box0_score0_87.jpg stdout --dpi 600 --psm 7 -l eng
That gives me exactly what is in the picture.
However, the same command returns nothing for the first picture.
If I drop --psm 7 and run this command on the first picture instead:
tesseract image_1758836719_box0_score0_87.jpg stdout --dpi 600 -l eng
I get the right text, but with a lot of noise before it:
Detected 6 diacritics
— sl O
a e any aS |
Lightning Greaves
But without --psm 7, the Aurochs file gives me "Empty page!!". I have not been able to find a single command that works for both images.
So I have two questions:
- Is there a way to say something like "try without --psm and, if the page comes back empty, try again with --psm 7"?
- Is it possible to provide my own list of words to look for? For example, can I provide "Aurochs, Greaves, Lightning" and force the OCR to use only those words?
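
To make both questions concrete, here is a rough Python sketch of the behaviour being asked about, wrapped around the same tesseract command line as above. The function names (run_tesseract, ocr_with_fallback, snap_to_vocabulary) and the 0.8 similarity cutoff are made up for illustration. The sketch assumes that diagnostics such as "Empty page!!" are written to stderr, so that an empty stdout means nothing was recognised. The word-list part is only post-processing with Python's difflib; it filters the output afterwards and does not constrain Tesseract itself.

```python
import difflib
import subprocess


def run_tesseract(image_path, psm=None):
    """Run the tesseract CLI on one image and return its stdout, stripped."""
    cmd = ["tesseract", image_path, "stdout", "--dpi", "600", "-l", "eng"]
    if psm is not None:
        cmd += ["--psm", str(psm)]
    # Assumption: messages like "Empty page!!" go to stderr, not stdout.
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout.strip()


def ocr_with_fallback(image_path, psm_attempts=(None, 7), runner=run_tesseract):
    """Try each page segmentation mode in order; return the first non-empty text.

    None means "do not pass --psm at all". Returns "" if every attempt is empty.
    """
    for psm in psm_attempts:
        text = runner(image_path, psm)
        if text:
            return text
    return ""


def snap_to_vocabulary(text, vocabulary, cutoff=0.8):
    """Keep only tokens that closely match a vocabulary word (case-insensitive).

    Each kept token is replaced by the vocabulary spelling; others are dropped.
    """
    by_lower = {word.lower(): word for word in vocabulary}
    kept = []
    for token in text.split():
        match = difflib.get_close_matches(
            token.lower(), list(by_lower), n=1, cutoff=cutoff
        )
        if match:
            kept.append(by_lower[match[0]])
    return " ".join(kept)
```

With this, something like snap_to_vocabulary(ocr_with_fallback("image_1758836719_box0_score0_87.jpg"), ["Aurochs", "Greaves", "Lightning"]) would run the command without --psm first, retry with --psm 7 only if nothing came back, and then throw away any token that is not close to one of the three words. Whether Tesseract can do either of these natively is still the question.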
Thanks,
JM