I have recently been working on extracting text from images using the Google Cloud Vision API.
The results are impressive, but I am stuck at the step where I compare the extracted text with the expected text: characters that look identical turn out to have different Unicode code points (most of them are not ASCII at all), so the strings never match.
For example:
for i in [77, 924, 1018, 1052]:
    print(chr(i))
The above code prints four characters that all look like the Latin letter 'M', but each one is a different character, so comparing any two of them returns False. The same problem occurs with several other letters.
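To make the difference visible, here is a minimal sketch that prints the code point and official Unicode name of each look-alike using the standard-library unicodedata module. The helper name describe is only an illustration and is not part of my actual pipeline.

```python
import unicodedata

def describe(code_points):
    """Return (character, 'U+XXXX', Unicode name) for each code point."""
    return [
        (chr(cp), "U+{:04X}".format(cp), unicodedata.name(chr(cp)))
        for cp in code_points
    ]

look_alikes = [77, 924, 1018, 1052]

# Show what each visually similar character really is.
for char, code, name in describe(look_alikes):
    print(char, code, name)

# Compare each character with the plain Latin 'M'.
print([chr(cp) == "M" for cp in look_alikes])
```

Only the first character is the Latin capital M; the others are Greek and Cyrillic letters, which is why the equality check fails for them.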
Any help or suggestions on how to deal with this would be greatly appreciated.
Google Vision code for text extraction:
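import cv2
import numpy as np
from google.cloud import vision

def detect_text(img_path, x, y, w, h):
    data = []
    client = vision.ImageAnnotatorClient()
    # Read the image from disk and crop it to the region of interest.
    im = cv2.imdecode(np.fromfile(img_path, dtype=np.uint8), cv2.IMREAD_UNCHANGED)[y:y+h, x:x+w]
    # Re-encode the crop as JPEG bytes for the API request.
    _, im_buf_arr = cv2.imencode(".jpg", im)
    content = im_buf_arr.tobytes()
    image = vision.Image(content=content)
    response = client.text_detection(image=image)
    # Check for an API error before using the annotations
    # (this check was unreachable when placed after the return).
    if response.error.message:
        raise Exception('{}\nFor more info on error messages, check: '.format(
            response.error.message))
    texts = response.text_annotations
    for text in texts:
        data.append('\n"{}"'.format(text.description))
    return data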