Percentage of accuracy

Mox Betex

Jun 29, 2019, 9:53:05 AM
to tesseract-ocr
Is it possible to get the percentage of accuracy of the recognized text?

I need to recognize text that may be in either of two languages, and Tesseract doesn't know exactly which language it is when I pass the parameter -l lang1+lang2.
What I want to do is run recognition with each language separately, but I would need some percentage of accuracy to decide which language is more probable.

Quan Nguyen

Jun 29, 2019, 11:31:32 AM
to tesseract-ocr
It's called a "confidence" value in Tesseract terminology. The hOCR output format contains confidence values, at the word level, I believe.
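
To illustrate where those values live: in Tesseract's hOCR output, each recognized word is a span with class ocrx_word, and its confidence appears as an x_wconf entry inside the title attribute. Below is a minimal sketch for pulling out the (word, confidence) pairs, assuming the hOCR was produced with something like `tesseract image.png out -l lang1 hocr`. The names WordConfParser, word_confidences and SAMPLE are made up for this example, and the sample markup is hand-written, not real OCR output.

```python
import re
from html.parser import HTMLParser

# Matches the "x_wconf NN" entry inside a word span's title attribute.
WCONF_RE = re.compile(r"x_wconf\s+(\d+(?:\.\d+)?)")


class WordConfParser(HTMLParser):
    """Collect (word, confidence) pairs from ocrx_word spans in hOCR."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._depth = 0      # span nesting depth inside the current word
        self._conf = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._depth > 0:
            # A span nested inside a word span; keep track so we close correctly.
            self._depth += 1
            return
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            match = WCONF_RE.search(attrs.get("title") or "")
            self._conf = float(match.group(1)) if match else None
            self._text = []
            self._depth = 1

    def handle_data(self, data):
        if self._depth > 0:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "span" or self._depth == 0:
            return
        self._depth -= 1
        # Only record words that actually carried a confidence value.
        if self._depth == 0 and self._conf is not None:
            self.words.append(("".join(self._text).strip(), self._conf))


def word_confidences(hocr_text):
    """Return a list of (word, confidence) tuples found in an hOCR string."""
    parser = WordConfParser()
    parser.feed(hocr_text)
    return parser.words


# Hand-written sample in the shape of Tesseract's hOCR word spans.
SAMPLE = (
    "<span class='ocrx_word' id='word_1_1' "
    "title='bbox 36 92 96 116; x_wconf 96'>Hello</span> "
    "<span class='ocrx_word' id='word_1_2' "
    "title='bbox 109 92 185 116; x_wconf 42'>wor1d</span>"
)

pairs = word_confidences(SAMPLE)
```

For a real file, read the .hocr contents into a string and pass it to word_confidences.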

Mox Betex

Jun 29, 2019, 1:00:45 PM
to tesseract-ocr
I have found the x_wconf value in the .hocr file.
How should I interpret that value? Does a high x_wconf value mean high accuracy?

Quan Nguyen

Jun 29, 2019, 1:36:33 PM
to tesseract-ocr
Yes, its values range from 0 to 100.
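
Given that 0 to 100 range, the original idea of running each language separately and comparing the results could be sketched as below. This is only one possible heuristic, not something Tesseract provides: it assumes you have already collected the word confidences for each run, and it picks the language with the highest mean. The function names and the numbers in runs are invented for illustration. Keep in mind that confidence is the engine's own estimate, so a high mean suggests, but does not guarantee, a correct result.

```python
from statistics import mean


def mean_confidence(confidences):
    """Average word confidence (0-100); 0.0 when no words were recognized."""
    return mean(confidences) if confidences else 0.0


def pick_language(confidences_by_lang):
    """Return the language code whose run has the highest mean confidence."""
    return max(
        confidences_by_lang,
        key=lambda lang: mean_confidence(confidences_by_lang[lang]),
    )


# Hypothetical word confidences from two separate runs on the same image.
runs = {
    "lang1": [96, 91, 88],
    "lang2": [42, 55, 61],
}

best = pick_language(runs)
print(best, round(mean_confidence(runs[best]), 1))
```

If the two means come out close to each other, the image may really contain both languages, in which case the combined -l lang1+lang2 run could still be the better choice.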