Improve OCR Accuracy

248 views
Skip to first unread message

Hamzeh abu-ajameia

unread,
Mar 26, 2021, 9:46:52 AM3/26/21
to tesseract-ocr
Hi Everyone,

I am a newbie to OCR, I've been trying to reach a 99% accuracy but i didn't manage to, would  you please give some tips to improve the results, below is a amole of used codes, it keeps skip some numbers and sometime predict wrong numbers.

Code :
import cv2 
import pytesseract 

  # Load the image img = cv2.imread("FD2sX.png")

  # Up-sample
img = cv2.resize(img, (0, 0), fx=2, fy=2)

  # Convert to the gray-scale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # OCR

print(pytesseract.image_to_string(gry))

black.png

The results

24,026 56.5% 5,798 13.6% 4,236 10.0% 3,546 8.3% 4,905 11.5% 42,511 100.0% 20,968 66.0% 9,279 29.2% 1,328 4.2% 145 0.5% 47 0.1% 31,767 100.0% 5,854 77.8% 1,617 21.5% 45 0.6% 0.1% 7,523 100.0% 50,848 62.2% 16,694 20.4% 5,564 3,736 4.6% 4,959 6.1% 81,801 100.0% 24,826 78.1% 5,866 18.5% 11 0.0% 1.6% 562 1.8% 31,773 100.0% 26,499 81.1% 5,940 18.2% 55 0.2% 167 0.5% 32,661 100.0% 6,008 87.2% 839 12.2% 0.6% 0.0% 6,886 100.0% 57,333 80.4% 12,645 17.7% 0.1% 713 1.0% 0.8% 71,320 100.0%

Thanks in advance.



  
 

Lorenzo Bolzani

unread,
Mar 26, 2021, 10:09:00 AM3/26/21
to tesser...@googlegroups.com
Hi Hamzeh,
next time please explain exactly where the problem is so that people here do not have to manually check all the numbers to spot the mistakes.

Try to threshold the image and upscale it more, I would start with four times.


Bye

Lorenzo


--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/d1c54d14-0eb4-49e4-8483-a1bb4c2d8748n%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages