Recognition of Polish words. Models with Polish characters or Training Ocropus 0.7 for Polish

inez....@gmail.com

Nov 4, 2014, 8:24:27 AM
to ocr...@googlegroups.com
I intend to train Ocropus 0.7 for the Polish language. Should I create new models, or extend the default model with Polish characters? The new models I created recognize the Polish words they were trained on well, but make mistakes on words without Polish characters.

Adnan ul Hasan

Nov 5, 2014, 5:36:43 PM
to ocr...@googlegroups.com
To create new models for Polish, you need transcribed text lines (images + GT), either real or artificially generated. You can add the Polish characters to chars.py in ocrolib and then use them instead of the default characters in ocropus-rtrain. Or you can use the -c option (use with caution, there is a small bug in the ocropus-rtrain code) to create the codec from your GT data.
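
As an illustration of what "creating the codec from GT data" starts from, here is a minimal sketch that collects the distinct characters found in ground-truth files. It is not the ocropus-rtrain code; the function name `collect_charset`, the `*.gt.txt` file naming, and the sample file contents are assumptions made for this example.

```python
import glob
import os
import tempfile


def collect_charset(gt_glob):
    """Return the sorted distinct characters found in UTF-8 ground-truth files."""
    chars = set()
    for path in glob.glob(gt_glob):
        with open(path, encoding="utf-8") as handle:
            chars.update(handle.read())
    # Line terminators belong to the file format, not to the transcription.
    chars -= {"\n", "\r"}
    return sorted(chars)


# Demo on two throwaway ground-truth files (hypothetical names and contents).
demo_dir = tempfile.mkdtemp()
samples = {"0001.gt.txt": "Zażółć gęślą jaźń\n", "0002.gt.txt": "Ocropus 0.7\n"}
for name, text in samples.items():
    with open(os.path.join(demo_dir, name), "w", encoding="utf-8") as handle:
        handle.write(text)

charset = collect_charset(os.path.join(demo_dir, "*.gt.txt"))
polish_only = [c for c in charset if c in "ąćęłńóśźżĄĆĘŁŃÓŚŹŻ"]
print("".join(charset))
print("Polish-specific:", "".join(polish_only))
```

Running something like this over your own GT directory is a quick way to check that every Polish character you expect actually occurs in the training data, and that plain Latin letters are covered too.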

inez....@gmail.com

Nov 6, 2014, 9:36:47 AM
to ocr...@googlegroups.com
I added the Polish characters to chars.py and had previously created models using ocropus-rtrain. I would like to try running the training with the -c option, but what should be given as its argument? I could not find it anywhere in the documentation.

inez....@gmail.com

Nov 17, 2014, 8:05:07 AM
to ocr...@googlegroups.com
I would like to create a codec using the command ocropus-train with the -c parameter, but I do not know what should be given as the argument to this parameter.

Adnan ul Hasan

Nov 18, 2014, 10:58:10 AM
to ocr...@googlegroups.com
As I mentioned earlier, there is a small bug in ocropus-rtrain. It will hopefully be fixed in a week or so; after that you can just use the -c option and it will generate the codec from the GT data. For the time being, you can use the ocropus-rtrain code at https://github.com/adnanulhasan/ocropy.
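
For readers unfamiliar with the term, a codec here is a mapping between characters and integer class labels. The sketch below shows the idea only; `SimpleCodec` is a hypothetical stand-in and not ocropy's own class, and reserving index 0 for a blank class is an assumption of this example.

```python
class SimpleCodec:
    """Hypothetical character codec: maps characters to integer classes and back."""

    def __init__(self, charset):
        # Index 0 is reserved for an empty "blank" class (assumption for this sketch).
        self.chars = [""] + sorted(set(charset))
        self.index = {c: i for i, c in enumerate(self.chars) if c}

    def encode(self, text):
        # Raises KeyError for any character that was not in the training charset.
        return [self.index[c] for c in text]

    def decode(self, codes):
        return "".join(self.chars[i] for i in codes)


# Build the codec from sample ground-truth lines (hypothetical contents).
gt_lines = ["zażółć gęślą jaźń", "Ocropus"]
codec = SimpleCodec("".join(gt_lines))

codes = codec.encode("gęś")
print(codes, codec.decode(codes))
```

The `KeyError` on unseen characters shows why the codec has to be built from GT data that covers both the Polish characters and the plain Latin ones: a character missing from the codec cannot be represented as a training target at all.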

inez....@gmail.com

Nov 20, 2014, 8:21:02 AM
to ocr...@googlegroups.com
Thanks for including a link to your version of the ocropus source. I will see how it works for me using the scripts you corrected. I also made some corrections so that it works properly on the remote machine.