Fail to differentiate capital letter I from number 1.

Lisa Ki

Aug 20, 2022, 2:20:56 AM

to tesseract-ocr

Hi guys, I am trying to extract text from some simple image clips, and Tesseract keeps reading the capital letter I as the number 1. Does anyone have any suggestions?

The only preprocessing I do is add a border to the original images, as in the code below:

import cv2
import numpy as np
import pytesseract
from PIL import Image

i = Image.open(ifp).convert('RGB')
colour = [255, 255, 255]  # white border
top, bottom, left, right = [150]*4  # 150 px on every side
i_with_border = cv2.copyMakeBorder(np.array(i), top, bottom, left, right, cv2.BORDER_CONSTANT, value=colour)
ocr_result = pytesseract.image_to_string(i_with_border)

results:

101.

102.

103.

104.

105.

106.

107.

108.

109.

110.

I'11.

112.

bordered_i09.jpg

bordered_i08.jpg

bordered_i11.jpg

bordered_i02.jpg

bordered_i03.jpg

bordered_i01.jpg

bordered_i06.jpg

bordered_i12.jpg

bordered_i07.jpg

bordered_i10.jpg

bordered_i05.jpg

bordered_i04.jpg

Shahin Majazi

Aug 22, 2022, 10:03:39 AM

to tesser...@googlegroups.com

With the following preprocessing steps, Tesseract's output improves for images I 02, I 03, I 04, I 05, I 06, I 08, I 10, I 11, and I 12, but not for the other images.

1. Grayscale Image: img_gray = cv2.imread(img_path, 0)

2. Erosion: img_eroded = cv2.erode(img_gray, np.ones((4, 4), np.uint8), iterations=1)

3. Rescaling: rescaling_img = cv2.resize(img_eroded, None, fx=3, fy=3, interpolation=cv2.INTER_CUBIC)

On Saturday, August 20, 2022 at 10:50, Lisa Ki <1123581321...@gmail.com> wrote:


Anna Ravalimanana

Sep 5, 2022, 2:11:15 PM

to tesser...@googlegroups.com

Hey guys, I need to run OCR on car license plates, but I need to correct the tilt first. Can you suggest the best way to do that? I would also like to train a model to do so, but I can't find any code for that. Any suggestions?
