Hi, I am working on a project that requires OCR. I have not used Tesseract much before, aside from trying it on a few basic examples with the command-line tool. My goal is to run OCR on insurance cards to get all of the characters and then pull certain information, such as the cardholder's ID, out of the output. Accuracy is critical here, as a single misread character invalidates the entire ID.
My concern stems from this need for extreme accuracy, which, based on this discussion thread, appears to be achievable only by running character recognition on each individual character on the card. The following quote is where most of my worry comes from:
But if accuracy is critical in your app, in the long run I would absolutely avoid using any parts of Tesseract except char classifier. I.e. crop every single char out of your source image and run Tess in the single char PSM. I think it's should be easy as long as location of every character is quite stable among your source images. ImageMagick/shell scripts would suffice.
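If I understand the quoted advice correctly, it might be sketched like this: compute a crop box for each character position and feed each crop to Tesseract in single-character mode (`--psm 10`, "treat the image as a single character"). The fixed cell layout below is purely hypothetical; it only makes sense when character positions are stable, which, as I explain below, mine are not.

```python
import subprocess

def char_boxes(x0, y0, w, h, n_chars):
    """Return one (left, top, right, bottom) crop box per character,
    assuming a fixed-pitch field starting at (x0, y0) with cells w x h.
    These coordinates would have to be measured from a real card."""
    return [(x0 + i * w, y0, x0 + (i + 1) * w, y0 + h) for i in range(n_chars)]

def tesseract_single_char_cmd(png_path):
    """CLI invocation for classifying one cropped character.
    --psm 10 is Tesseract's 'single character' page segmentation mode."""
    return ["tesseract", png_path, "stdout", "--psm", "10"]

# Each cropped image (e.g. produced with ImageMagick's `convert -crop`,
# as the quote suggests) would then be classified with something like:
#   subprocess.run(tesseract_single_char_cmd("char_0.png"),
#                  capture_output=True, text=True)
```

Cropping itself could be done with ImageMagick or Pillow; I have left it out since the crop boxes are the part that depends on a stable layout.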
However, the images I will be processing differ vastly in layout; they are not stable like the example I linked to. Some examples of how the format may differ follow:
I have run Tesseract on some samples, and while it works for most of the characters, there are cases where it misreads a single character (such as reading an "H" when the character is a "W") or, even worse, an entire phrase (such as reading "No New Rum" when the phrase is actually "No Referral Required"). Because of errors like this, I cannot use the output Tesseract currently gives me.
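One mitigation I am considering is validating the extracted ID against its known format after OCR, so a misread like the ones above is at least detected rather than silently accepted. A minimal sketch, assuming a made-up ID format of three letters followed by nine digits (the real pattern would come from the insurer's card spec):

```python
import re

# Hypothetical member-ID format: 3 uppercase letters then 9 digits.
ID_PATTERN = re.compile(r"\b[A-Z]{3}\d{9}\b")

def find_member_id(ocr_text):
    """Return the first token in the OCR output matching the expected
    ID format, or None if no plausible ID was read."""
    m = ID_PATTERN.search(ocr_text)
    return m.group(0) if m else None
```

This only catches errors that break the format, of course; a "1" misread as an "I" in the digit portion would fail the pattern, but a "3" misread as an "8" would still pass.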
Is there a realistic way to use Tesseract for this kind of endeavor?
Thanks for taking the time to read,
Scott