That's a very difficult task, and I don't think it can be done with code. Instead, you can try removing the background using ImageMagick, which increases the transparency of the background. For more information: http://www.imagemagick.org/discourse-server/viewtopic.php?t=24959
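That said, if the watermark happens to be noticeably lighter than the printed text, a plain brightness threshold may be enough to suppress it before OCR. Below is a minimal sketch of that idea, not a tested recipe for your page: the function name `suppress_watermark`, the `cutoff` value of 160, and the tiny synthetic "page" are all my own assumptions for illustration. It works on rows of 8-bit grayscale values, so you would still need to load the PNG into that form with whatever imaging library you prefer.

```python
def suppress_watermark(gray_rows, cutoff=160):
    """Binarize a grayscale image given as rows of 0-255 values.

    Pixels darker than `cutoff` are treated as text and become black (0);
    everything lighter, including a pale watermark, becomes white (255).
    The cutoff of 160 is a guess and must be tuned per document.
    """
    return [[0 if value < cutoff else 255 for value in row] for row in gray_rows]


# Synthetic example (hypothetical values): 30-40 = dark text,
# 200 = light watermark stroke, 255 = blank paper.
page = [
    [255, 200, 255, 30],
    [200, 40, 200, 255],
]
cleaned = suppress_watermark(page)
```

This only helps where the watermark and the text differ clearly in brightness. Where the two overlap at similar darkness, the threshold cannot separate them. ImageMagick's `-threshold` option performs the same kind of operation from the command line.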
Regards
Lakshay
Is there a way to remove the diagonal letters from this page? I'm trying to run it through Tesseract, but I'm getting a bunch of garbage where the watermark meets the letters in the image. I used Ghostscript to convert the original PDF into the PNG file. Thanks in advance for your help!
--
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/e275c83b-0d38-4d0a-9dc6-e514d46b1216%40googlegroups.com.