Does the psm value used to generate lstmf files influence the training?

Lorenzo Bolzani
Mar 21, 2019, 6:55:44 AM
to tesser...@googlegroups.com

Hi,
I keep having problems with duplicated letters with custom fine-tuned models.

For example, an M becomes MH.

I'm using ocrd-train with crops of real images, and I noticed that the lstmf files are generated with psm 6.

At runtime I use psm 7. Do you think this may make a difference? From a quick test, that does not seem to be the case.

The problem gets worse if I use psm 13 for recognition, which is why I'm wondering whether the two are related.
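
To make the comparison repeatable, the same crop can be run through each page segmentation mode in turn. The following is a minimal sketch that only builds the command lines; the image name `line_crop.png` and the model name `my_finetuned` are placeholders, not the actual files from my setup.

```python
import shlex


def tesseract_cmd(image, model, psm, tessdata_dir=None):
    """Build a tesseract CLI call that prints the recognized text to stdout."""
    cmd = ["tesseract", image, "stdout", "-l", model, "--psm", str(psm)]
    if tessdata_dir:
        # Only needed when the fine-tuned traineddata is outside the default dir.
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd


def comparison_cmds(image, model, psms=(6, 7, 13), tessdata_dir=None):
    """Return one command per page segmentation mode, keyed by psm value."""
    return {psm: tesseract_cmd(image, model, psm, tessdata_dir) for psm in psms}


if __name__ == "__main__":
    # Placeholder file and model names (assumptions for illustration).
    for psm, cmd in comparison_cmds("line_crop.png", "my_finetuned").items():
        print(f"psm {psm}: {shlex.join(cmd)}")
```

Each printed line can be pasted into a shell, or passed to `subprocess.run(cmd, capture_output=True, text=True)` to collect the outputs side by side.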

Is there something else that I'm doing wrong that might lead to this problem? Or something I can improve?

I have only one font (ocr-b) with a fixed height (44px plus a 2px white margin).

According to this post, the sweet spot seems to be closer to 30px (for most fonts).
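
As a rough sketch of what rescaling toward that figure would mean for my crops: the code below assumes the 2px margin is on both the top and the bottom (so a 48px tall image), that the 30px target refers to the same text-height measure as my 44px, and a hypothetical crop width of 600px. None of this is confirmed by the post.

```python
def scaled_size(width, height, text_height, target_text_height=30):
    """Return (new_width, new_height, scale) so the text is target_text_height px tall."""
    if text_height <= 0:
        raise ValueError("text_height must be positive")
    scale = target_text_height / text_height
    # Round to whole pixels and never collapse a dimension to zero.
    new_width = max(1, round(width * scale))
    new_height = max(1, round(height * scale))
    return new_width, new_height, scale


if __name__ == "__main__":
    # Assumed crop: 600px wide (hypothetical), 44px text + 2px margin top and bottom.
    new_w, new_h, scale = scaled_size(width=600, height=48, text_height=44)
    print(f"scale={scale:.3f}, new size={new_w}x{new_h}")
```

If I went this way, I would presumably have to apply the same scaling to both the training crops and the runtime input.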

Thanks, bye

Lorenzo