tesseract and pytesseract unable to detect numbers on web images


samuele tonon

Dec 2, 2024, 10:02:58 AM
to tesseract-ocr
Hello,
I am trying to detect digits in images that are screenshots of a web application. Since the numbers are clear, I was expecting a correct reading, but that is not the case.

The images are taken from the game 2048: I crop the grid and convert it to white text on a dark background.

$ tesseract pil_check_1733137499.png stdout -c preserve_interword_spaces=1 -l snum --psm 6 digits

gives no output.

$ tesseract pil_check_1733137501.png stdout -c preserve_interword_spaces=1 -l snum --psm 6 digits

gives no output.

$ tesseract pil_check_1733137511.png stdout -c preserve_interword_spaces=1 -l snum --psm 6 digits

gives:

8                   2

2        2        2       16

32       4        2

In this case the output is good. The first line is missing, but that is something I can live with. However, I don't know how to improve the results further, and any help would be appreciated.


pil_check_1733137499.png
pil_check_1733137501.png
pil_check_1733137511.png

Ger Hobbelt

Dec 2, 2024, 2:48:11 PM
to tesser...@googlegroups.com
Please try again after inverting the images, i.e. black text on white background.
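
A minimal sketch of that inversion step in Python, assuming Pillow and pytesseract are installed. The helper names `to_black_on_white` and `read_digits` are made up for illustration, and the "mean below mid-gray means dark background" rule is only a heuristic that assumes the background covers most of the image. The `snum` language and the `--psm 6` / `preserve_interword_spaces` options are copied from the commands in the original post; the `digits` config is left out here.

```python
from PIL import Image, ImageOps


def to_black_on_white(img: Image.Image) -> Image.Image:
    """Return a grayscale copy with dark text on a light background."""
    gray = img.convert("L")
    # Heuristic (assumption): the background dominates the image, so a mean
    # brightness below mid-gray means light text on a dark background.
    histogram = gray.histogram()
    total = sum(histogram)
    mean = sum(level * count for level, count in enumerate(histogram)) / total
    return ImageOps.invert(gray) if mean < 128 else gray


def read_digits(path: str, lang: str = "snum") -> str:
    """Invert the image if needed, then run Tesseract on it."""
    # Imported lazily so the inversion helper works without pytesseract.
    import pytesseract

    with Image.open(path) as img:
        prepared = to_black_on_white(img)
    # Options mirror the command line from the original post.
    return pytesseract.image_to_string(
        prepared,
        lang=lang,
        config="--psm 6 -c preserve_interword_spaces=1",
    )
```

For example, `print(read_digits("pil_check_1733137499.png"))` would run the first screenshot through the same pipeline. Whether this recovers the missing digits still has to be checked on the actual images.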

Met vriendelijke groeten / Best regards,

Ger Hobbelt

--------------------------------------------------
web:    http://www.hobbelt.com/
        http://www.hebbut.net/
mail:   g...@hobbelt.com
mobile: +31-6-11 120 978
--------------------------------------------------