Fail to differentiate capital letter I from number 1.


Lisa Ki

Aug 20, 2022, 2:20:56 AM
to tesseract-ocr
Hi guys, I am trying to extract text from some simple clips, and Tesseract keeps reading the capital letter I as the number 1. Does anyone have any suggestions?

The only preprocessing I have done is to add a border to the original images, using the code below:

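import cv2
import numpy as np
import pytesseract
from PIL import Image

i = Image.open(ifp).convert('RGB')  # ifp: path to the input image
colour = [255, 255, 255]  # white border
top, bottom, left, right = [150]*4  # 150 px of padding on every side
i_with_border = cv2.copyMakeBorder(np.array(i), top, bottom, left, right, cv2.BORDER_CONSTANT, value=colour)
ocr_result = pytesseract.image_to_string(i_with_border)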

results:
101.

102.

103.

104.

105.

106.

107.

108.

109.

110.

I'11.

112.

bordered_i09.jpg
bordered_i08.jpg
bordered_i11.jpg
bordered_i02.jpg
bordered_i03.jpg
bordered_i01.jpg
bordered_i06.jpg
bordered_i12.jpg
bordered_i07.jpg
bordered_i10.jpg
bordered_i05.jpg
bordered_i04.jpg

Shahin Majazi

Aug 22, 2022, 10:03:39 AM
to tesser...@googlegroups.com
With the following preprocessing steps, Tesseract's output improves for images {I 02, I 03, I 04, I 05, I 06, I 08, I 10, I 11, I 12}, but not for the others.


1. Grayscale Image: img_gray = cv2.imread(img_path, 0)
2. Erosion: img_eroded = cv2.erode(img_gray, np.ones((4, 4), np.uint8), iterations=1)
3. Rescaling: rescaling_img = cv2.resize(img_eroded, None, fx=3, fy=3, interpolation=cv2.INTER_CUBIC)

On Saturday, Aug 20, 2022 at 10:50, Lisa Ki <1123581321...@gmail.com> wrote:

Anna Ravalimanana

Sep 5, 2022, 2:11:15 PM
to tesser...@googlegroups.com
Hey guys, I need to run OCR on car license plates, but I have to correct the tilt first. Can you suggest the best way to do that? I would also like to train a model for it, but I can't find any code for that. Any suggestions?
