Tesseract


Sayani Mondal

Sep 22, 2015, 12:00:38 PM
to tesseract-ocr

Hello All,

I am trying to extract characters from a PNG image using Tesseract. The attached image is a screenshot of a program written in VS2012. I crop the code editor section and save it, then run Tesseract from the command prompt with the makebox parameter to retrieve the bounding box of each individual character. The output I get is shown below.
 
#    startcolumn      startrow  endcolumn  endrow

However, the desired output is as follows.

3  startcolumn      startrow  endcolumn  endrow
#    startcolumn      startrow  endcolumn  endrow
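
For anyone reproducing this, here is a minimal Python sketch for reading box lines like the ones above. It assumes the usual Tesseract box-file layout of six space-separated fields per line (symbol, left, bottom, right, top, page), with coordinates measured from the bottom-left corner of the image. The names `Box`, `parse_box_line`, and `parse_box_text` are hypothetical helpers, not part of Tesseract, and the sample values are made up for illustration rather than taken from my actual output.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """One character box (assumed layout: symbol left bottom right top page)."""
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> Box:
    """Parse a single box line into a Box.

    Splits from the right so the symbol field is taken as-is,
    whatever character it contains.
    """
    parts = line.rstrip("\r\n").rsplit(" ", 5)
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, got {len(parts)}: {line!r}")
    char = parts[0]
    left, bottom, right, top, page = (int(p) for p in parts[1:])
    return Box(char, left, bottom, right, top, page)


def parse_box_text(text: str) -> List[Box]:
    """Parse the full contents of a box file, skipping blank lines."""
    return [parse_box_line(ln) for ln in text.splitlines() if ln.strip()]


if __name__ == "__main__":
    # Illustrative values only; not real Tesseract output.
    sample = "3 1 20 8 32 0\n# 10 20 18 32 0\n"
    for box in parse_box_text(sample):
        print(box.char, box.right - box.left, box.top - box.bottom)
```

Listing the parsed symbols this way makes it easy to see which characters, such as the line number `3`, are missing from the result.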

I have tried changing the font in VS2012 and saving the screenshot in TIFF format, but the problem persists: Tesseract does not detect the line numbers or all of the characters correctly. Is this because cropping the image file reduces its pixel depth? If so, how can I increase it so that all characters are extracted correctly?

