Training Metrics


Simon

Nov 22, 2023, 8:50:45 AM
to tesseract-ocr
While training my model I came across the following metrics:

E.g.:
At iteration 6345/6500/6500, Mean rms=6.246%, delta=7.139%, char train=68.07%, word train=92.2%, skip ratio=0%,  New best char error = 68.07 wrote checkpoint.

Unfortunately I can't find a proper, detailed description or explanation of these metrics on the web.

This information would really help in evaluating the metrics; right now it feels more like guessing which values are "good". Since most developers lack experience with this, it is pretty hard to tell which values are "good" or "bad".



Des Bw

Nov 22, 2023, 9:34:52 AM
to tesseract-ocr
The character error rate (CER) is the most common measure of the quality of your training.
- Train with a large amount of data and run it for a couple of epochs, so that your CER gets as close as possible to 0.01. That is the most common strategy.
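
For readers unfamiliar with the term, here is a minimal sketch of how a character error rate is commonly computed: the edit distance between the recognized text and the ground truth, divided by the length of the ground truth. This is the textbook definition, not necessarily the exact computation Tesseract performs during training; the function names `levenshtein` and `cer` are made up for this illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions and substitutions
    needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate as a fraction (0.01 corresponds to 1%)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One wrong character out of five gives a CER of 0.2 (20%).
    print(cer("hello", "hallo"))
```

Under this definition a CER of 0.01 means roughly one wrong character per hundred characters of ground truth.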


Simon

Nov 23, 2023, 4:28:35 AM
to tesseract-ocr
Alright, 

this might be a little bit of a dumb question, but where exactly can I see the CER?

2 Percent improvement time=56, best error was 12.49 @ 8294
At iteration 8350/10000/10000, Mean rms=2.701%, delta=2.491%, char train=10.385%, word train=24.4%, skip ratio=0%,  New best char error = 10.385 wrote best model:data/Common_num/checkpoints/Common_num10.385_8350.checkpoint wrote checkpoint.

Is it the "best char error"? Where do I have to look to find CER? Is the CER in the above example?

Also, what are the signs that my model is overfitted? Is there any way to recognize this in the above output?

Des Bw

Nov 23, 2023, 4:34:40 AM
to tesseract-ocr
I think they are abbreviations:
best char error = BCER
character error = CER
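
To make it easier to find these values in the training output, here is a small sketch that pulls the numbers out of a log line like the ones quoted above. It assumes that the "char train" value is the character error (in percent) and that the line format matches the examples in this thread; the function name `parse_log_line` and the dictionary keys are made up for this illustration.

```python
import re

# Matches the per-checkpoint status line printed during training,
# based only on the two example lines quoted in this thread.
STATUS = re.compile(
    r"At iteration (\d+)/(\d+)/(\d+), "
    r"Mean rms=(?P<rms>[\d.]+)%, delta=(?P<delta>[\d.]+)%, "
    r"char train=(?P<char_train>[\d.]+)%, "
    r"word train=(?P<word_train>[\d.]+)%, "
    r"skip ratio=(?P<skip_ratio>[\d.]+)%"
)
BEST = re.compile(r"New best char error = (?P<best>\d+(?:\.\d+)?)")


def parse_log_line(line: str):
    """Return the metrics of one status line as a dict, or None if the
    line is not a status line."""
    m = STATUS.search(line)
    if m is None:
        return None
    metrics = {name: float(value) for name, value in m.groupdict().items()}
    metrics["iterations"] = tuple(int(m.group(i)) for i in (1, 2, 3))
    best = BEST.search(line)
    # Only present when the line reports a new best character error.
    metrics["best_char_error"] = float(best.group("best")) if best else None
    return metrics


if __name__ == "__main__":
    line = ("At iteration 6345/6500/6500, Mean rms=6.246%, delta=7.139%, "
            "char train=68.07%, word train=92.2%, skip ratio=0%,  "
            "New best char error = 68.07 wrote checkpoint.")
    print(parse_log_line(line))
```

Running this over a whole training log and collecting `char_train` per line gives the list of character errors, and the smallest value reported as "New best char error" is what is called BCER here.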

There are no signs that tell you whether the model is overfit; I know of no diagnostics for that. For fine-tuning, running more than 400 iterations is always problematic because it destroys the base model.

- So, the common strategy is to increase your data and run just 300 iterations. The BCER is not that important in that case.
But for training from scratch or from a layer (network), you should try to get the BCER (error rate) as low as possible. Overfitting happens when the data set is too small and the number of iterations is too high. From my experience, running 2-5 epochs seems to generate good results. But I have seen experienced people training for hundreds, even thousands, of epochs.
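
To relate epochs to the iteration counts shown in the log, here is a minimal sketch. It assumes that one iteration processes one training line, so that one epoch equals one pass over all training lines; the function name `iterations_for_epochs` and the numbers in the example are hypothetical.

```python
def iterations_for_epochs(num_training_lines: int, epochs: int) -> int:
    """Iterations needed for the given number of passes over the data,
    assuming one training line is processed per iteration."""
    if num_training_lines <= 0 or epochs <= 0:
        raise ValueError("both arguments must be positive")
    return num_training_lines * epochs


if __name__ == "__main__":
    # Hypothetical data set of 2,000 lines trained for 2 to 5 epochs.
    for epochs in (2, 5):
        print(epochs, "epochs ->", iterations_for_epochs(2000, epochs), "iterations")
```

Seen this way, a fixed iteration limit corresponds to many epochs on a small data set and only a few on a large one, which is why the same limit can be harmless in one case and excessive in another.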

Des Bw

Nov 23, 2023, 4:36:06 AM
to tesseract-ocr
The BCER (best character error rate) is picked automatically by Tesseract from the list of all character error rates (CER) seen so far.