Hello,
I am using Tesseract 3.02 and Leptonica 1.70 on 64-bit RHEL 6.5.
To get the hang of OCR parsing, I ran it against the attached images to extract the text.
The attached document contains both the input image and the OCR output text.
Can someone please tell me whether this is the expected recognition quality, or whether there is any way to improve the result?
Did I miss something?
We plan to use Tesseract to parse scanned copies of official documents and persist the text extracted from them.
I am not sure if I am doing it right.
Kindly suggest how I can improve the quality.
Regards
Sridhar