Border Noise Removal


Everest

Jun 16, 2015, 4:41:03 PM
to ocr...@googlegroups.com
Hello, I am working on a project dealing with document images. What I want to do now is remove the border noise from a full scanned color image. Since I couldn't find any documentation on this for the project, could anyone explain how to apply a method to get the expected text area from an original or binarized image? Thank you very much!

Sachin Garg

Jun 16, 2015, 6:03:00 PM
to ocr...@googlegroups.com
A very good intro is here:

http://www.danvk.org/2015/01/07/finding-blocks-of-text-in-an-image-using-python-opencv-and-numpy.html

However, in my case the "-n" option to "ocropus-nlbin" (as in
http://www.danvk.org/2015/01/09/extracting-text-from-an-image-using-ocropus.html)
did not work.

I have a similar problem: vertical text around the borders that needs to
be ignored ...

--
Sachin Garg <sga...@masonlive.gmu.edu>
Doctoral Student
School of Policy, Government & Intl. Affairs
George Mason University
3351 Fairfax Drive MS3B1, Arlington, VA 22201, USA
Phone: +1-703-993-3787 Cell: +1-571-222-3216

Everest

Jun 17, 2015, 4:45:51 PM
to ocr...@googlegroups.com, sga...@masonlive.gmu.edu

Thank you very much, Sachin! I will try that. What I want to do is dewarp the text in a picture like this. Currently I get no good result because of the border noise, since my algorithm does not cope with it well. Do you think I could either physically detect and cut off the border noise, or use a filter pipeline to get rid of this edge noise content?

And this is what I got from the ocropus-nlbin -n option you mentioned.


Any suggestion would be appreciated. Thank you very much!
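
For the "detect and cut off" option, here is a minimal sketch of one simple way it could be done on an already binarized page, using row and column ink profiles. This is not the method from the linked article and not anything ocropus provides; the function name crop_border_noise and the thresholds max_frac and min_frac are made up for illustration and would need tuning on real scans. It assumes the border noise shows up as nearly solid dark bands along the page edges, so it will not help with vertical text in the margins.

```python
import numpy as np

def crop_border_noise(ink, max_frac=0.5, min_frac=0.005):
    """Estimate the text area of a binarized page.

    ink: 2-D boolean array, True where a pixel is dark (ink).
    Returns (top, bottom, left, right) as half-open slice bounds.
    max_frac and min_frac are illustrative thresholds, not tuned values.
    """
    ink = np.asarray(ink, dtype=bool)

    # Rows/columns that are mostly ink are treated as scanner border bands.
    solid_rows = ink.mean(axis=1) > max_frac
    solid_cols = ink.mean(axis=0) > max_frac

    # Erase those bands so they do not count towards the profiles below.
    cleaned = ink.copy()
    cleaned[solid_rows, :] = False
    cleaned[:, solid_cols] = False

    def extent(profile):
        # First and last index whose ink fraction is above the blank level.
        idx = np.flatnonzero(profile > min_frac)
        if idx.size == 0:
            return 0, profile.size  # nothing found: keep the full range
        return int(idx[0]), int(idx[-1]) + 1

    top, bottom = extent(cleaned.mean(axis=1))
    left, right = extent(cleaned.mean(axis=0))
    return top, bottom, left, right
```

The returned bounds can be used directly as a crop, e.g. page[top:bottom, left:right], before passing the image on to dewarping.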

 


On Tuesday, June 16, 2015 at 6:03:00 PM UTC-4, Sachin Garg wrote:

Zura Isakadze

Jun 20, 2015, 9:30:22 AM
to ocr...@googlegroups.com
Hi Everest, I ended up using ScanTailor for this job. I also noticed that in my case, without setting any options, ScanTailor produces better binarization.

Everest

Jun 22, 2015, 3:15:21 PM
to ocr...@googlegroups.com
Hi Zura, I have a lot of pictures to process, and I want the process to be automatic. Do you know of any software that can handle this? Thank you very much!

On Saturday, June 20, 2015 at 9:30:22 AM UTC-4, Zura Isakadze wrote: