Hi,
I am trying to extract each word from a .png image (converted from a PDF document).
I am using Python 2.7 and the tesseract-3.05 API.
However, for a few of the documents, instead of drawing a bounding box around each word, Tesseract draws a single box around a larger area and misses some of the words.
I am using 1111.png as input; the output is attached as 1111_op.png.
Could you please help me understand what the reason could be?
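
For reference, here is a minimal sketch of what "word-level boxes" means in this context. It is not the exact code used to produce 1111_op.png. It assumes the boxes are read from Tesseract's TSV output (for example from something like `tesseract 1111.png out tsv`, which I believe is available in 3.05), where rows with level 5 are words. The helper name `word_boxes` and the sample values are hypothetical, made up only for illustration.

```python
# Hypothetical sample of Tesseract TSV output (made-up values, illustration only).
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t600\t800\t-1\t\n"        # page row
    "4\t1\t1\t1\t1\t0\t50\t40\t300\t20\t-1\t\n"       # line row
    "5\t1\t1\t1\t1\t1\t50\t40\t80\t20\t91\tHello\n"   # word row
    "5\t1\t1\t1\t1\t2\t140\t40\t90\t20\t88\tworld\n"  # word row
)

WORD_LEVEL = "5"  # assumption: level 5 marks word rows in the TSV output


def word_boxes(tsv_text):
    """Return (text, left, top, width, height) for each non-empty word row."""
    lines = tsv_text.splitlines()
    header = lines[0].split("\t")
    boxes = []
    for line in lines[1:]:
        fields = line.split("\t")
        if len(fields) < len(header):
            continue  # skip malformed or truncated rows
        row = dict(zip(header, fields))
        text = row["text"].strip()
        # Keep only word rows that actually carry recognized text.
        if row["level"] == WORD_LEVEL and text:
            boxes.append((text,
                          int(row["left"]), int(row["top"]),
                          int(row["width"]), int(row["height"])))
    return boxes


for box in word_boxes(SAMPLE_TSV):
    print(box)
```

If a box in the output covers a whole region rather than one word, comparing the word rows against the line and block rows for the same area may show whether the words were merged or never recognized at all.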
Regards,
Mrinmoy