Update Dictionary(ENGLISH) and Finding confidence level of recognized words


Raghu R K

May 14, 2007, 9:06:47 AM
to tesseract-ocr
Hi,

I am working with tesseract-1.02 OCR. I want to update the English
dictionary, and I also want to know the confidence level of the words
the OCR recognizes. Could anybody help me with this?

Waiting for replies.

Best Regards,
Raghu R K

Scan...@gmail.com

May 14, 2007, 3:21:26 PM
to tesseract-ocr
I posted a DLL which returns a structure that contains the
confidence.

This data comes from the OCR's append-char routine.

I am told the DLL will be integrated into the next version.

Raghu R K

May 15, 2007, 10:11:28 AM
to tesseract-ocr
Hi,
Thanks for the reply,

Can you tell me where I can get the DLL, so that I can integrate it into
my code and test it?

Best Regards,
Raghu.R.K.


Scan...@gmail.com

May 15, 2007, 10:33:30 AM
to tesseract-ocr
www.scanhelp.com/pfile/tessdll.zip

Look at the structure that is returned. It gives a confidence value.

Raghu R K

May 16, 2007, 1:51:43 AM
to tesseract-ocr
Hi,
Thanks a lot,

Along with word confidence, I am working on word boundaries, to extract
features (such as area, length, height, etc.) of each word in the
image. Can I get the DLL source code, so that I can modify (enhance) it
appropriately and get the results I want?

Best Regards,
Raghu R K
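
A rough sketch of the word-feature idea described above, in Python. It assumes the OCR step yields one record per character with a bounding box, a confidence value, and a count of preceding blanks. The `CharBox` record, its field names, and the rule "a character with blanks > 0 starts a new word" are all hypothetical and are not taken from the DLL's actual structure.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class CharBox:
    """Hypothetical per-character OCR result; not the real DLL structure."""
    char: str
    left: int
    right: int
    top: int
    bottom: int
    confidence: int
    blanks: int = 0  # assumed: number of spaces preceding this character


@dataclass
class WordFeatures:
    text: str
    left: int
    right: int
    low: int    # smallest y coordinate seen in the word
    high: int   # largest y coordinate seen in the word
    mean_confidence: float

    @property
    def length(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.high - self.low

    @property
    def area(self) -> int:
        return self.length * self.height


def _summarize(chars: List[CharBox]) -> WordFeatures:
    # Use min/max over both y values so the result does not depend on
    # whether the origin is at the top-left or the bottom-left.
    ys = [y for c in chars for y in (c.top, c.bottom)]
    return WordFeatures(
        text="".join(c.char for c in chars),
        left=min(c.left for c in chars),
        right=max(c.right for c in chars),
        low=min(ys),
        high=max(ys),
        mean_confidence=sum(c.confidence for c in chars) / len(chars),
    )


def word_features(chars: Iterable[CharBox]) -> List[WordFeatures]:
    """Group characters into words and compute each word's box features."""
    words: List[WordFeatures] = []
    current: List[CharBox] = []
    for c in chars:
        # Assumed rule: preceding blanks mark the start of a new word.
        if c.blanks > 0 and current:
            words.append(_summarize(current))
            current = []
        current.append(c)
    if current:
        words.append(_summarize(current))
    return words
```

The mean is just one possible way to turn character confidences into a word confidence, and it stays on whatever scale the OCR engine reports. Taking the worst character value instead would be an equally reasonable choice.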


sriranga

May 17, 2007, 12:01:41 PM
to tesseract-ocr
Hi,
How do I integrate the DLL into tesseract-1.03, which I have already
installed on Ubuntu 7.04?
Regards,


Scan...@gmail.com

May 21, 2007, 4:20:39 PM
to tesseract-ocr
The DLL code is in 1.04, but it is Windows only. I hope to do a shared
library version as well.

Raghu R K

May 23, 2007, 6:21:54 AM
to tesseract-ocr
Hi,

Thanks a lot for the V1.04 release.
I am using the TessDll for Windows and got good results in terms of
word recognition, but for some words the confidence values returned
in the structure EANYCODE_CHAR are more than 100. Can you tell me
what the range of confidence values is? Is it 0-100, or more than
that?


Thanks in advance.

Best Regards,
Raghu R K

