Help improving OCR result quality

sri...@dashout.in

Jun 11, 2014, 11:36:03 PM

to tesser...@googlegroups.com

Hello,

I am using Tesseract 3.02 and Leptonica 1.70 on RHEL 6.5 (64-bit).
To get the hang of OCR parsing, I ran it against the attached images to extract their text.
The attached document contains both the input image and the OCR output text.

Can someone please tell me whether this is the expected recognition quality, or whether the results can be improved?
Did I miss something?

We plan to use Tesseract to parse scanned copies of official documents and persist the extracted text.
I am not sure I am doing it right.

Kindly suggest how I can improve the quality.

Regards
Sridhar

ocr_test.docx

zdenko podobny

Jun 12, 2014, 2:28:24 AM

to tesser...@googlegroups.com

Don't send images and output in an MS Word document.
Did you check the project wiki?

Zdenko
