Unable to detect the text from the Segmented Image


Sheeban Sadiq

Mar 14, 2017, 3:26:52 AM
to tesseract-ocr
Hello,


I have an image that was de-skewed (rotated) and then segmented into lines.
Segmentation produced the image '1234567.png', which I then cropped to 'one.jpg' using GIMP.
I've been trying to extract the text from the segmented image, but nothing is being extracted :(

I've attached the segmented image.

I used the following Tesseract OCR code (via pytesser) to extract the text:

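"""
from PIL import Image
from pytesser import image_to_string, image_file_to_string

image_file = 'one.jpg'

# Attempt 1: OCR an image that has already been opened with PIL
im = Image.open(image_file)
text = image_to_string(im)

# Attempt 2: OCR directly from the file path. graceful_errors=True asks
# pytesser to fall back to loading the file through PIL if Tesseract
# cannot read it directly. This result overwrites the one above.
text = image_file_to_string(image_file, graceful_errors=True)

print("=====output=======\n")
print(text)
"""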

Please do help.

Thanks in advance :)
one.jpg
1234567.png

Martin Fadrhons

Mar 14, 2017, 7:37:48 AM
to tesseract-ocr
If that is the original quality of the image used for OCR, it is really poor. Also, Tesseract OCR is not designed to recognize handwritten text. Improving the image quality should help, at least for the printed part of the text.

Hope it helps,
Martin
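
For readers wondering what "improving quality" might look like in practice, here is a minimal sketch using Pillow. It is not something posted in this thread: the function name `preprocess_for_ocr`, the 2x upscale factor, and the threshold of 128 are all assumptions chosen for illustration and would need tuning for a real scan.

```python
from PIL import Image, ImageOps


def preprocess_for_ocr(im, scale=2, threshold=128):
    """Return a cleaned-up copy of `im` that may be easier to OCR.

    Hypothetical helper; `scale` and `threshold` are illustrative
    defaults, not values recommended by Tesseract.
    """
    # Work in grayscale so the later steps see a single channel.
    gray = im.convert("L")

    # Upscale small crops; small glyphs tend to be recognized poorly.
    width, height = gray.size
    gray = gray.resize((width * scale, height * scale), Image.LANCZOS)

    # Stretch the histogram so text and background are further apart.
    gray = ImageOps.autocontrast(gray)

    # Binarize: pixels brighter than `threshold` become white, the rest black.
    return gray.point(lambda p: 255 if p > threshold else 0)
```

With the script from the question, this could be used as `image_to_string(preprocess_for_ocr(Image.open('one.jpg')))`. It may also help to crop and save the segment as PNG rather than JPEG, since JPEG compression adds artifacts around character edges. None of this is expected to help with the handwritten part.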

On Tuesday, March 14, 2017 at 8:26:52 AM UTC+1, Sheeban Sadiq wrote: