Problem with parsing bold white fonts


Евгений Захаров

Aug 4, 2020, 7:07:47 AM
to tesseract-ocr
Hi. It would be great if somebody could help with parsing these bold white fonts.

I am trying to parse these images with some preprocessing. For example, after "erode", "GaussianBlur" and "Canny", some parts of the text start to be recognised, but not all. With different preprocessing settings, different parts of the images are recognised.
Is there an image preprocessing algorithm that can detect such text? Or is the only option to retrain the model to detect this bold font?

Untitled.png

Untitled2.png

Untitled3.png


Zdenko Podobny

Aug 4, 2020, 1:18:54 PM
to tesser...@googlegroups.com
First you need to remove the background.
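
One possible way to do that for white text, as a minimal sketch only: keep the pixels that are close to pure white and turn everything else into plain background, which leaves black text on a white page. The function name `isolate_white_text` and both threshold values are assumptions for illustration and would need tuning on the real images. The sketch uses only NumPy on an RGB array; loading the image and passing the result to Tesseract are left out.

```python
import numpy as np


def isolate_white_text(rgb, min_brightness=200, max_channel_spread=40):
    """Return a binary image: 0 (black) where a pixel is near-white, 255 elsewhere.

    rgb is an H x W x 3 array. Both thresholds are illustrative guesses.
    """
    rgb = np.asarray(rgb, dtype=np.int16)   # avoid uint8 overflow when subtracting
    darkest = rgb.min(axis=2)               # weakest channel of each pixel
    spread = rgb.max(axis=2) - darkest      # 0 for pure grey/white, large for saturated colours
    is_text = (darkest >= min_brightness) & (spread <= max_channel_spread)
    return np.where(is_text, 0, 255).astype(np.uint8)


# Tiny synthetic image: a bluish background with two near-white "text" pixels
# and one bright yellow pixel that must not be treated as text.
image = np.zeros((2, 3, 3), dtype=np.uint8)
image[:] = (30, 90, 160)
image[0, 1] = (255, 255, 255)
image[1, 2] = (250, 245, 240)
image[1, 0] = (255, 255, 0)

binary = isolate_white_text(image)
print(binary.tolist())  # [[255, 0, 255], [255, 255, 0]]
```

Requiring every channel to be bright, instead of thresholding the average brightness, is what rejects bright but coloured background areas such as the yellow pixel above.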


evgzak...@gmail.com

Aug 5, 2020, 3:34:32 AM
to tesseract-ocr
Thanks, I will try it.

Evgeny
