Remove the thin horizontal line

Sundar Andaperumal

Sep 6, 2024, 5:08:35 AM
to tesseract-ocr
Hi,

I am trying to remove the thin horizontal line. When I do, the text in the SUBTOTAL section gets disturbed and the result contains special characters such as `°`, `—`, `~`, and `*`.

How can I ignore or remove this horizontal line and extract the correct text from the SUBTOTAL section? Image attached.

thanks!
new.jpg

Zdenko Podobny

Sep 6, 2024, 12:35:21 PM
to tesser...@googlegroups.com

On Fri, Sep 6, 2024 at 11:08, Sundar Andaperumal <sunda...@gmail.com> wrote:

Tom Morris

Sep 17, 2024, 11:15:52 AM
to tesseract-ocr
The "mosquito noise" compression artifacts around all the sharp edges are going to make both line removal and OCR harder than they need to be. If you can capture images without those artifacts, you'll have a much easier job.

Tom
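
For readers who want a starting point for the line removal asked about above, here is a minimal sketch in plain Python. It is not something posted in this thread: the function name `remove_thin_horizontal_lines` and the parameters `min_len` and `max_thickness` are hypothetical, and it assumes the image has already been binarized into rows of 0 (background) and 1 (ink). The idea is to erase long horizontal runs of ink only where the ink is thin vertically, so that character strokes crossing the line are left alone. On a capture with heavy compression artifacts the binarized line may be broken into short runs, which is one way the noise Tom describes can defeat this kind of filter.

```python
def remove_thin_horizontal_lines(img, min_len=40, max_thickness=3):
    """Erase long, thin horizontal ink runs from a binary image.

    img is a list of rows, each a list of 0 (background) or 1 (ink).
    min_len and max_thickness are illustrative defaults (assumptions),
    to be tuned for the actual scan resolution. Returns a new image.
    """
    height = len(img)
    width = len(img[0]) if height else 0
    out = [row[:] for row in img]  # copy so the input is not modified

    def vertical_extent(y, x):
        # Length of the contiguous ink segment in column x through (y, x).
        top = y
        while top > 0 and img[top - 1][x]:
            top -= 1
        bottom = y
        while bottom < height - 1 and img[bottom + 1][x]:
            bottom += 1
        return bottom - top + 1

    for y in range(height):
        x = 0
        while x < width:
            if not img[y][x]:
                x += 1
                continue
            start = x
            while x < width and img[y][x]:
                x += 1
            # Only runs at least min_len wide are treated as ruling lines.
            if x - start >= min_len:
                for cx in range(start, x):
                    # Keep pixels that belong to a taller stroke,
                    # e.g. a letter stem crossing the line.
                    if vertical_extent(y, cx) <= max_thickness:
                        out[y][cx] = 0
    return out
```

The cleaned array would then be converted back to an image before being passed to Tesseract. In practice the same effect is usually obtained with an image-processing library (for example, a morphological opening with a wide, one-pixel-high kernel), and the thresholds need tuning per document.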