Help in improving the OCR Result quality.

sri...@dashout.in

Jun 11, 2014, 11:36:03 PM
to tesser...@googlegroups.com


Hello,

I am using tesseract 3.02 and leptonica-1.70 on a RHEL 6.5 64-bit OS.
To get the hang of OCR parsing, I ran tesseract against the attached images to extract the text.
The attached document has both the input images and the OCR output text.

Can someone please tell me whether this is the expected recognition quality, or whether there is any chance of improving the result?
Did I miss something?

We planned to use tesseract to parse scanned copies of official documents and persist the extracted text.
I am not sure if I am doing it right.

Please suggest how I can improve the quality.

Regards
Sridhar


ocr_test.docx

zdenko podobny

Jun 12, 2014, 2:28:24 AM
to tesser...@googlegroups.com
  1. Don't send images and output in an MS Word document.
  2. Did you check the project wiki?

Zdenko


