Barcodes on the page are read as gibberish text - need a workaround

Gaurav Shegokar

Mar 30, 2021, 2:44:06 PM

to tesseract-ocr

I have PDFs with barcodes on almost every page, and the current Tesseract reads each barcode as gibberish text.

How do I get around this? Is it a problem with line/word segmentation, or with the LSTM OCR engine? Or do I have to use some CV tricks to remove the barcodes before running the pages through Tesseract?
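
To make the last option concrete, here is a rough sketch of the kind of preprocessing I have in mind. It is only an illustration, not something I have validated on real scans. It assumes the page is already binarized into rows of 0 (white) and 1 (black), that the barcodes are 1D with vertical bars, and that each barcode sits in its own horizontal band with no text beside it. The function names and the thresholds `min_transitions` and `min_height` are made up for this example and would need tuning.

```python
def count_transitions(row):
    """Count black/white changes along one pixel row."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def find_barcode_bands(image, min_transitions=20, min_height=10):
    """Return (top, bottom) row ranges, bottom exclusive, that look like a 1D barcode.

    Heuristic (an assumption, not a tested detector): a barcode is a tall run
    of identical rows, each with many black/white transitions. Text rows also
    have many transitions, but neighbouring text rows differ from each other.
    """
    bands = []
    start = None  # first row of the current candidate band
    for y, row in enumerate(image):
        busy = count_transitions(row) >= min_transitions
        same_as_prev = start is not None and row == image[y - 1]
        if busy and (start is None or same_as_prev):
            if start is None:
                start = y
        else:
            # The candidate band ends here; keep it only if it is tall enough.
            if start is not None and y - start >= min_height:
                bands.append((start, y))
            start = y if busy else None
    if start is not None and len(image) - start >= min_height:
        bands.append((start, len(image)))
    return bands


def mask_barcodes(image, **kwargs):
    """Return a copy of the image with every detected barcode band set to white."""
    cleaned = [list(row) for row in image]
    for top, bottom in find_barcode_bands(image, **kwargs):
        for y in range(top, bottom):
            cleaned[y] = [0] * len(cleaned[y])
    return cleaned
```

The cleaned rows would then be written back to an image and passed to Tesseract as usual. Real scans are noisy, so exact row equality is probably too strict, and a tolerance on the number of differing pixels would likely be needed.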

Thank you for your help.

Best,

Gaurav.
