Issues with OCR recognition


Daniele Consoli

Sep 10, 2022, 6:09:53 AM
to tesseract-ocr
Greetings, 
I'm new to Tesseract, and I seem to be having some fundamental issues. I couldn't find a solution in the guides online, so I figured I'd ask here.

I am trying to OCR a few pages of a PDF, but the results are typically very inaccurate (see pictures below).
5.png
unknown.png
I installed this version of Tesseract (https://github.com/UB-Mannheim/tesseract/wiki) for Windows.

The command I used to get those results is "tesseract --oem 3 5.png 5"

Any help would be greatly appreciated!

Daniele Consoli

Sep 10, 2022, 7:01:24 AM
to tesseract-ocr
Update: I have fixed this myself by using an app to convert all the pictures to black and white.
I think the issue might have been with the binarisation of the images that Tesseract was doing.
I'm getting much better results now!
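
For anyone who wants to try the same black-and-white conversion without a separate app, here is a minimal sketch of one common way to binarise a grayscale image, Otsu's method, in plain Python. This is only an illustration and not necessarily what the app above does. The function names `otsu_threshold` and `binarize` are made up for this example, and the image is assumed to be already loaded as rows of 0-255 gray values.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0-255) that best separates dark from light pixels.

    Pixels <= t form one class and pixels > t the other; Otsu's method picks
    the t that maximises the between-class variance.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of gray values in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break               # every pixel is already in the dark class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(image, threshold):
    """Map each pixel to 0 (black) or 255 (white) using the given threshold."""
    return [[255 if p > threshold else 0 for p in row] for row in image]


# Tiny hypothetical "image": dark text pixels (30-40) on a light page (200-220).
image = [
    [30, 40, 200],
    [35, 210, 220],
]
flat = [p for row in image for p in row]
t = otsu_threshold(flat)
print(t)                    # 40
print(binarize(image, t))   # [[0, 0, 255], [0, 255, 255]]
```

In practice you would load the real page with an image library, binarise it, save the result, and then run the same tesseract command on the saved file.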

Zdenko Podobny

Sep 10, 2022, 7:57:14 AM
to tesser...@googlegroups.com
I would suggest reading the documentation before using any tool.

On Sat, Sep 10, 2022 at 9:01, Daniele Consoli <consoli...@gmail.com> wrote: