Finetuning in ocrd-train


ameera...@gmail.com

Mar 10, 2019, 12:53:52 AM
to tesseract-ocr

Here is the fine-tuning rule from the ocrd-train Makefile:


ifdef START_MODEL
$(LAST_CHECKPOINT): unicharset lists $(PROTO_MODEL)
	mkdir -p data/checkpoints
	lstmtraining \
	  --traineddata $(PROTO_MODEL) \
	  --old_traineddata $(TESSDATA)/$(START_MODEL).traineddata \
	  --continue_from data/$(START_MODEL)/$(START_MODEL).lstm \
	  --net_spec "[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c`head -n1 data/unicharset`]" \
	  --model_output data/checkpoints/$(MODEL_NAME) \
	  --learning_rate 20e-4 \
	  --train_listfile data/list.train \
	  --eval_listfile data/list.eval \
	  --max_iterations 10000


Why is the following line needed when fine-tuning? I thought --net_spec was only used when training from scratch.
--net_spec "[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c`head -n1 data/unicharset`]" \


Should the learning rate be set lower for fine-tuning? The same rate, 20e-4, is used for training from scratch, so I would expect the fine-tuning rate to be significantly lower.
--learning_rate 20e-4 \

Shree Devi Kumar

Mar 10, 2019, 3:40:47 AM
to tesser...@googlegroups.com


ameera...@gmail.com

Mar 10, 2019, 6:56:34 PM
to tesseract-ocr
Thanks Shree!

Jens Humrich

Mar 21, 2019, 5:49:15 AM
to tesseract-ocr
The parameter can be left out of the command. It does not appear to change the result.
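
For illustration, and assuming "the parameter" here means --net_spec, the fine-tuning rule with that line dropped might look like the sketch below. It is simply the recipe quoted at the top of the thread minus one line, not a tested change to the ocrd-train Makefile; every variable and flag is taken from that quoted rule.

```makefile
# Sketch only: the fine-tuning rule quoted earlier in this thread,
# with the --net_spec line removed. Recipe lines must start with a tab.
ifdef START_MODEL
$(LAST_CHECKPOINT): unicharset lists $(PROTO_MODEL)
	mkdir -p data/checkpoints
	lstmtraining \
	  --traineddata $(PROTO_MODEL) \
	  --old_traineddata $(TESSDATA)/$(START_MODEL).traineddata \
	  --continue_from data/$(START_MODEL)/$(START_MODEL).lstm \
	  --model_output data/checkpoints/$(MODEL_NAME) \
	  --learning_rate 20e-4 \
	  --train_listfile data/list.train \
	  --eval_listfile data/list.eval \
	  --max_iterations 10000
```

The learning rate is left at the value from the original rule; whether a lower value works better for fine-tuning is the open question from the first post and would need to be checked on your own data.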