Barcodes on the page are read as gibberish text - need a workaround

Gaurav Shegokar

Mar 30, 2021, 2:44:06 PM
to tesseract-ocr
I have PDFs with barcodes on almost every page. The issue is that the current Tesseract reads each barcode as gibberish text.

How do I get around this? Is the problem in the line/word segmentation or in the LSTM OCR? Or do I have to use some CV tricks to remove the barcodes before running the pages through Tesseract?

Thank you for your help.

Best,
Gaurav.
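
To make the "CV tricks" option concrete, here is a minimal sketch of one possible preprocessing step, not a tested or recommended solution. It assumes the page has already been rendered to a 2-D grayscale NumPy array with dark ink on light paper, and that the barcodes are 1-D codes with vertical bars. The function name `mask_barcodes`, the tile size, and both thresholds are hypothetical choices made for illustration. The idea is that a barcode tile has many black/white transitions along each row and almost none along each column, whereas text has plenty of both.

```python
import numpy as np


def mask_barcodes(gray, tile=32, min_h_per_row=4, max_v_ratio=0.1):
    """Return a copy of `gray` with barcode-like tiles painted white.

    gray: 2-D uint8 array, dark ink on a light background (assumption).
    tile, min_h_per_row, max_v_ratio: illustrative values, not tuned.
    """
    ink = gray < 128  # crude fixed threshold (assumption)
    out = gray.copy()
    height, width = ink.shape
    # Partial tiles at the right and bottom edges are skipped in this sketch.
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            block = ink[y:y + tile, x:x + tile]
            # Transitions between horizontally adjacent pixels.
            h_trans = np.count_nonzero(block[:, 1:] != block[:, :-1])
            # Transitions between vertically adjacent pixels.
            v_trans = np.count_nonzero(block[1:, :] != block[:-1, :])
            many_bars = h_trans >= min_h_per_row * tile
            few_row_changes = v_trans <= max_v_ratio * h_trans
            if many_bars and few_row_changes:
                out[y:y + tile, x:x + tile] = 255  # paint the tile white
    return out
```

The returned array would then be passed to Tesseract in place of the original page image. Tiles that only partly overlap a barcode may survive, and rotated or 2-D codes are not handled, so the thresholds would need tuning on real pages.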