Help Needed - How to best extract information from complex images (text, barcodes, symbols)

Max Song

Aug 22, 2023, 1:45:39 AM
to tesseract-ocr
Hello,

I have an image (below) that I would like to extract text from. However, the image is complex: it combines text, barcodes, and symbols. After preprocessing the image, I found that Tesseract wrongly identifies the barcodes as text. Below are the images before and after processing and the Tesseract output. I have also attached my preprocessing code. I would love to hear how I can improve this. Thanks.

Code:
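import cv2
import numpy as np
import pytesseract

# Load the input image (file name assumed from the attachment below)
img = cv2.imread('CTO-UPC.png')
# Upscale 4x so small text is easier to recognize
img = cv2.resize(img, None, fx=4, fy=4, interpolation=cv2.INTER_CUBIC)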
# Converting to gray scale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Removing Shadows
dilated_img = cv2.dilate(gray, np.ones((7, 7), np.uint8))
bg_img = cv2.medianBlur(dilated_img, 21)
diff_img = 255 - cv2.absdiff(gray, bg_img)
norm_img = cv2.normalize(diff_img, None, alpha=0, beta=255,
                         norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_8UC1)

# Apply dilation and erosion to remove some noise
# (note: with a 1x1 kernel, both operations leave the image unchanged)
kernel = np.ones((1, 1), np.uint8)
# increases the white region in the image
img = cv2.dilate(norm_img, kernel, iterations=1)
# erodes away the boundaries of foreground object
img = cv2.erode(img, kernel, iterations=1)

# Apply blur to smooth out the edges
img = cv2.GaussianBlur(img, (5, 5), 0)

# Apply threshold to get image with only b&w (binarization)
img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

cv2.imwrite('./filter.png', img)

custom_config = r'--oem 3 --psm 6'
text = pytesseract.image_to_string(img, config=custom_config, lang='eng')
print(text)

CTO-UPC.png
processed.png