Re: Impact of bigram-dawg

Nick White

Jun 12, 2013, 12:19:01 PM
to tesser...@googlegroups.com
Hi Florian.

The only training included with Tesseract that uses bigram dawg
files is eng, and the eng.config only has these 2 relevant entries:

load_bigram_dawg True
tessedit_enable_bigram_correction True

So I guess you're providing Tesseract all it needs. You could look
for the "verified by bigram model" reference in the code and see if
there is any useful nearby code that is controlled by a config
variable.

I don't think many people use bigram dawgs, so it's possible it
isn't very mature / useful. The cube code seems to be lots more
bigram focused, but we can't train for that yet...

Good luck, and let us know how you get on.

Nick

Florian K.

Jun 14, 2013, 3:38:00 AM
to tesser...@googlegroups.com
Hi Nick,
thanks for answering.
I had a quick look at the code base. There seems to be no "enabling" variable, and from the code it appears to do what it should. Maybe there are still some bugs in this part of the code.
I'm planning to try the English model with and without the bigram-dawg and evaluate whether it works here.
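
A rough sketch of how such a comparison could be scored, assuming a ground-truth transcription and two OCR outputs are available as plain text (one run with the bigram DAWG, one without). The function names here are hypothetical, and word error rate is only one possible metric; none of this is part of Tesseract itself.

```python
def word_edit_distance(reference, hypothesis):
    """Levenshtein distance between two lists of words."""
    # prev[j] holds the distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,           # word missing from OCR output
                            curr[j - 1] + 1,       # extra word in OCR output
                            prev[j - 1] + cost))   # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference_text, ocr_text):
    """Word-level edit distance divided by the number of reference words."""
    reference = reference_text.split()
    hypothesis = ocr_text.split()
    if not reference:
        raise ValueError("reference text is empty")
    return word_edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Made-up strings for illustration only, not real Tesseract output.
    truth = "the quick brown fox"
    with_bigram = "the quick brown fox"
    without_bigram = "the quick brovvn fox"
    print("with bigram dawg:   ", word_error_rate(truth, with_bigram))
    print("without bigram dawg:", word_error_rate(truth, without_bigram))
```

Running both configurations over the same set of pages and comparing the two rates would show whether the bigram DAWG makes any measurable difference.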

Best regards,

Florian 