Hi everyone,
I am a new user of tesseract-ocr and have been using it from Python with the pytesseract wrapper.
In pytesseract, I can call two functions: 1) image_to_string, which returns the characters it recognizes as a text string, and 2) image_to_data, which returns the recognized text plus verbose information, including the bounding box coordinates and the confidence of each prediction.
I have used both functions and expected them to return the same text, but the results differ a lot. My guess is that image_to_data uses -psm 0 as a hard-coded default and that this parameter cannot be changed, whereas with image_to_string I can set -psm 6, which returns fairly reasonable results.
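
A minimal sketch of how the two calls could be compared under the same page segmentation mode. It assumes a pytesseract version in which image_to_data accepts the same config argument as image_to_string and supports output_type=Output.DICT. The helper names rebuild_lines and compare are hypothetical, and the image path is whatever file is being tested. The flag is written as "-psm 6" to match the above; newer Tesseract releases spell it "--psm 6".

```python
def rebuild_lines(data):
    """Join the words of an image_to_data dict back into text lines."""
    lines = {}
    for i, word in enumerate(data["text"]):
        if not str(word).strip():
            continue  # rows for blocks, paragraphs and lines carry no text
        # Words that share block, paragraph and line numbers sit on one line.
        key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
        lines.setdefault(key, []).append(str(word))
    return [" ".join(words) for _, words in sorted(lines.items())]


def compare(image_path, config="-psm 6"):
    """Run both calls with the same config and return their text lines."""
    # Imported here so rebuild_lines can be used without Tesseract installed.
    from PIL import Image
    import pytesseract
    from pytesseract import Output

    image = Image.open(image_path)
    text = pytesseract.image_to_string(image, config=config)
    data = pytesseract.image_to_data(
        image, config=config, output_type=Output.DICT
    )
    return text.splitlines(), rebuild_lines(data)
```

If the two lists returned by compare still differ when both calls receive the same config, the page segmentation mode is probably not the cause of the difference.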
Cheers,
Alan