Re: [tesseract-ocr] Re: How to increase the accuracy of Tesseract OCR
Art W Rhyno
Aug 26, 2014, 11:44:40 AM
to tesser...@googlegroups.com
> Kindly advise the solution to make the image readable by Tesseract OCR.
Hi,
Since you already have leptonica with tesseract, you might have some luck
by going through its line removal example [1]. I converted your sample image
to grayscale, ran the lineremoval code on it, and then put the results
through tesseract. It could be refined further, but it might be useful as a
starting point.
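To give a rough idea of what removing ruled lines before OCR means, here is a minimal sketch in plain Python. It is not the leptonica lineremoval code, which works on the grayscale image and is more involved. It is a simplified stand-in that clears long horizontal runs of ink in a tiny binary grid. The names `remove_horizontal_lines`, `parse`, `render` and the `min_run` threshold are made up for this illustration.

```python
def remove_horizontal_lines(grid, min_run):
    """Return a copy of grid with horizontal ink runs of >= min_run pixels cleared.

    grid is a list of rows, each row a list of 0 (background) or 1 (ink).
    This is an illustrative simplification, not leptonica's algorithm.
    """
    cleaned = [row[:] for row in grid]
    for y, row in enumerate(grid):
        x = 0
        width = len(row)
        while x < width:
            if row[x] == 0:
                x += 1
                continue
            # Measure the run of consecutive ink pixels starting at x.
            start = x
            while x < width and row[x] == 1:
                x += 1
            # Long runs are assumed to be ruled lines; short ones are text strokes.
            if x - start >= min_run:
                for i in range(start, x):
                    cleaned[y][i] = 0
    return cleaned


def parse(rows):
    """Turn strings of '#' (ink) and '.' (background) into a 0/1 grid."""
    return [[1 if ch == "#" else 0 for ch in line] for line in rows]


def render(grid):
    """Turn a 0/1 grid back into strings of '#' and '.'."""
    return ["".join("#" if v else "." for v in row) for row in grid]


sample = parse([
    "..#...#.....",
    "..#...#..#..",
    "############",  # the ruled line crossing the text
    "..#......#..",
    "..###....#..",
])
cleaned = remove_horizontal_lines(sample, min_run=8)
for line in render(cleaned):
    print(line)
```

Note that this naive version also erases the pixels where text strokes cross the line, and it does nothing for skewed or vertical lines, so treat it only as a way to picture the preprocessing step.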
to tesser...@googlegroups.com
> Is it possible to do the same line removal in leptonica using Java?
Hi,
The lineremoval code is located in the "prog" subdirectory of the leptonica
distribution. You could copy "lineremoval.c" into a separate subdirectory
and use it as a starting point. I don't know how hard it would be to
implement it in Java, but you could probably use the Java Native Interface
to call the code directly from a Java program. The tesseract-android-tools
[1] project might have some building blocks for Java integration. Alas, I
have not done much with tesseract/leptonica in Java. This project [2] also
mentions a Java API for accessing natively-compiled Tesseract and Leptonica
APIs, but I haven't looked into it. The lineremoval source code documents
each step. It is not effective for all types of lines, but I think it would
work well for the images you are dealing with.