certainty and rating of tesseract's result


Ryosuke Sasaki

May 27, 2015, 9:42:38 AM
to tesser...@googlegroups.com
I am using tesseract-ocr to recognize characters in images for my own application.
I would like to know how the confidence, certainty, and rating of a recognition result are calculated in this library.

Does anybody know how this is done in Tesseract, or the theory behind the confidence calculation?

Ray Smith

May 27, 2015, 3:43:03 PM
to tesser...@googlegroups.com
Character:
Compute a distance measure between 0 and 1 of the character from a training sample.

certainty  = -20 * distance
rating = distance * outline length

Word:
certainty = min over word of character certainty.
rating = sum over word of character rating.
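
As a rough illustration of the formulas above, here is a minimal Python sketch. It is not Tesseract's actual code: the function names, the constant name, and the sample numbers are all hypothetical, and it assumes the per-character distance and outline length are already known.

```python
from typing import Sequence, Tuple

# Scale factor taken from "certainty = -20 * distance" above.
CERTAINTY_SCALE = -20.0


def char_scores(distance: float, outline_length: float) -> Tuple[float, float]:
    """Return (certainty, rating) for one character.

    distance is the 0..1 distance from a training sample;
    outline_length is the length of the character's outline.
    """
    if not 0.0 <= distance <= 1.0:
        raise ValueError("distance must be between 0 and 1")
    certainty = CERTAINTY_SCALE * distance
    rating = distance * outline_length
    return certainty, rating


def word_scores(chars: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (certainty, rating) for a word.

    chars holds one (distance, outline_length) pair per character.
    """
    if not chars:
        raise ValueError("a word needs at least one character")
    scores = [char_scores(d, length) for d, length in chars]
    certainty = min(c for c, _ in scores)  # worst character decides
    rating = sum(r for _, r in scores)     # ratings accumulate
    return certainty, rating


if __name__ == "__main__":
    # Hypothetical three-character word: (distance, outline length)
    word = [(0.25, 40.0), (0.5, 50.0), (0.125, 32.0)]
    print(word_scores(word))  # (-10.0, 39.0)
```

Because certainty is never positive, values closer to 0 mean a better match, and the word certainty is simply that of its worst-matching character.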


Ryosuke Sasaki

May 28, 2015, 8:14:31 AM
to tesser...@googlegroups.com
Thanks a lot.
I understand the formulas for certainty and rating now.
I have read the technical documentation about Tesseract.
As I understand it, the distance is calculated between the feature vector extracted from the unknown character and the prototypes in the training data set. Is that right?
If so, please let me know where that calculation is implemented in the Tesseract source code.
Best regards
