2009/11/26 Debayan Banerjee <deba...@gmail.com>:
> Wait for my next post where I will analyse how to solve the
> Indic-dictionary bug.
In fact it was a single-line change. Here is the patch. The change is
in dict/permute.cpp:
--- tesseract-2.04/dict/permute.cpp 2008-11-14 23:07:17.000000000 +0530
+++ tessmod/dict/permute.cpp 2009-11-26 00:34:50.660737699 +0530
@@ -1077,6 +1077,7 @@
     return (NULL);
   if (permute_only_top)
     return result_1;
+  any_alpha=1;
   if (any_alpha && array_count (char_choices) <= MAX_WERD_LENGTH) {
     result_2 = permute_words (char_choices, rating_limit);
     if (class_probability (result_1) < class_probability (result_2)
For non-English scripts the if condition was never satisfied, and
hence the DAWG files were not being scanned properly. Explicitly
setting any_alpha=1 just above the condition solves this problem for
the time being. There is probably a more elegant solution, though.
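To illustrate why a flag like any_alpha can stay 0 for Indic text, here is a minimal sketch. It assumes, without having confirmed this against the 2.04 source, that the flag comes from a byte-wise isalpha() test in the default "C" locale. The helper name any_alpha_bytewise is hypothetical and is not part of Tesseract.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in for how an "any alphabetic character?" flag might be
// computed. ASSUMPTION: the real code uses a byte-wise isalpha() style check.
bool any_alpha_bytewise(const std::string& utf8_text) {
  for (unsigned char byte : utf8_text) {
    // In the default "C" locale, isalpha() is true only for A-Z and a-z.
    // Every byte of a multi-byte UTF-8 sequence is >= 0x80, so it never matches.
    if (std::isalpha(byte)) {
      return true;
    }
  }
  return false;
}
```

With this helper, any_alpha_bytewise("word") is true, while any_alpha_bytewise("\xE0\xA6\x95") (the UTF-8 bytes of the Bengali letter KA, U+0995) is false. Under that assumption the dictionary branch would be skipped for Indic words, which matches the behaviour described above. A cleaner fix than forcing the flag to 1 might be to decide "alphabetic" from the character set data rather than from isalpha().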
By the way, I do not see this particular if condition anywhere in
this file in the trunk. Perhaps the developers have already fixed it
there.
--
Regards,
Debayan Banerjee
Yes. The latest tesseract-indic download supports dictionary lookup.
--
Regards,
Debayan Banerjee