to tesseract-ocr
On Saturday, March 5, 2016 at 5:11:55 AM UTC-5, Gunasekaran Velu wrote:
>tesseract.exe Underline.png Underline -l eng -psm 1
Result: This is underline word @
Is it possible to run OCR on underlined text in an image, or does some image processing need to be applied to the image first?
Attached sample image.
Tesseract knows how to recognize underlined text, as you can see from the fact that it got "underline" correct in your example. For some reason it's getting confused by the underlined word "test", perhaps because it's at the end of the line?
It could potentially represent a bug, but I'd try to recreate it with a less artificial example. Of course, pre-processing would improve the situation and removing underlines should be that hard to do.
Tom
Gunasekaran Velu
Mar 6, 2016, 7:38:03 PM
Hi
The first image I sent was one I created myself in Paint.
Now I have attached underlined text from a real document (cropped from the full image because the rest contains confidential data).
In this case, when I run OCR, Tesseract skips the underlined text completely.
Hi Tom
Any update regarding the underlined text problem?
Regards
Guna
Gunasekaran Velu
Apr 16, 2016, 5:06:12 AM
Hi Tom
Is it possible to use a config variable for images with underlined text?
Looking forward to your reply.
Regards
Guna
On Monday, March 7, 2016 at 6:08:03 AM UTC+5:30, Gunasekaran Velu wrote:
Tom Morris
Apr 16, 2016, 1:56:37 PM
There's a critical word missing from what I wrote and perhaps my English is a little ambiguous too, so let me try again:
It could potentially represent a bug, but, if I were you, I'd try to recreate it with a less artificial example and if you confirm that it's a real bug, file a bug report with all the details of your findings so that one of the developers can look at it. Of course, pre-processing would improve the situation and removing underlines should not be that hard to do.
The most direct route to success, in my opinion, is going to be pre-processing to remove the underlines. When you're working on this and testing the results, you should make sure that you work on representative images, not tiny fragments of a few words. When Tesseract has normal page boundaries, multiple lines of text, etc., it has much more information available to figure out font size, line spacing, and so on.
If you need help in figuring out how to do the line removal, there are tutorials available on the web, but any recipe is going to need tuning and experimentation to work best with your particular application.
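As a rough illustration of the kind of line removal meant here (not a recipe from this thread), the sketch below clears long horizontal runs of dark pixels from an already binarized image. It assumes the image is a plain grid of 0/1 values, and the function name `remove_long_horizontal_runs` and the `min_run` threshold are hypothetical names chosen for this example. A real pipeline would more likely use an image library and would need tuning against representative pages.

```python
from typing import List

# Assumption: the page is already binarized; 1 = dark (ink) pixel, 0 = background.
Image = List[List[int]]


def remove_long_horizontal_runs(img: Image, min_run: int) -> Image:
    """Return a copy of img with every horizontal run of dark pixels
    that is at least min_run pixels long cleared to background."""
    out = [row[:] for row in img]  # copy so the input is left untouched
    for y, row in enumerate(img):
        x = 0
        width = len(row)
        while x < width:
            if row[x] == 1:
                start = x
                # Advance to the end of this run of dark pixels.
                while x < width and row[x] == 1:
                    x += 1
                # Long runs are treated as underline segments and erased.
                if x - start >= min_run:
                    for i in range(start, x):
                        out[y][i] = 0
            else:
                x += 1
    return out


if __name__ == "__main__":
    page = [
        [0, 1, 0, 1, 1, 0, 1, 0],  # short runs: letter strokes
        [0, 1, 0, 1, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1],  # long run: underline
    ]
    cleaned = remove_long_horizontal_runs(page, min_run=6)
    for row in cleaned:
        print("".join("#" if p else "." for p in row))
```

The threshold should be longer than the widest horizontal stroke of any single character, otherwise parts of letters are erased too. Descenders that cross the underline (g, p, y) will be cut at the line, and skewed scans break an underline into shorter runs, so deskewing first and some experimentation are to be expected.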
If you've got additional questions, feel free to address them to the list rather than me personally. I wasn't offering to help you debug this for free or to write the application for you.
Tom
Gunasekaran Velu
Apr 19, 2016, 9:29:55 PM
Thanks Tom.
Regards
Guna
Felix Bolivar
Jul 13, 2016, 2:35:08 PM